mysql full text search techniques_PHP tutorial
<< Back to man.ChinaUnix.net
MySQL Reference Manual for version 4.1.0-alpha.
------ -------------------------------------------------- -----------------------
6.8 MySQL full-text search
As of 3.23.23, MySQL begins to support full-text indexing and search. The full-text index in MySQL is a FULLTEXT type index. FULLTEXT indexes are used on MyISAM tables and can be created on CHAR, VARCHAR or TEXT columns using ALTER TABLE or CREATE INDEX at or after CREATE TABLE. For large databases, it is very fast to load the data into a table without a FULLTEXT index and then use ALTER TABLE (or CREATE INDEX) to create the index. Loading data into a table that already has a FULLTEXT index will be very slow.
Full text search is completed through the MATCH() function.
mysql> CREATE TABLE articles (
-> id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
-> title VARCHAR(200),
-> body TEXT,
-> FULLTEXT (title,body)
-> );
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO articles VALUES
-> ( NULL,'MySQL Tutorial', 'DBMS stands for DataBase ...'),
-> (NULL,'How To Use MySQL Efficiently', 'After you went through a ...'),
-> (NULL,'Optimising MySQL','In this tutorial we will show ...'),
-> (NULL,'1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
-> (NULL,'MySQL vs. YourSQL', 'In the following database comparison ...'),
-> (NULL,'MySQL Security', 'When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM articles
-> WHERE MATCH (title,body) AGAINST ('database');
+----+--------------------------------+---- -----------------------------------------------+
| id | title | body | 🎜>+----+-------------------+----------------------- --------------------+
| 5 | MySQL vs. YourSQL | In the following database comparison ... |
| 1 | MySQL Tutorial | DBMS stands for DataBase ... |
+----+--------------------------------+------------- --------------------------------+
2 rows in set (0.00 sec)
Function MATCH( ) performs a natural language search for a string against a text set (a set of one or more columns contained in a FULLTEXT index). The search string is given as an argument to AGAINST(). The search is performed ignoring the case of letters. For each row in the table, MATCH() returns a correlation value. That is, the similarity measure between the search string and the text of the record row in the column specified in the MATCH() list.
When MATCH() is used in a WHERE clause (see the example above), the returned rows are automatically sorted in order of relevance from highest to lowest. Correlation values are non-negative floating point numbers. Zero correlation means no similarity. Relevance is calculated based on: the number of words in a row, the number of unique words in a row, the total number of words in a set, and the number of documents (rows) that contain a particular word.
It can also perform a logical pattern search. This is described in the following sections.
The previous examples are some basic instructions on the use of function MATCH(). Rows are returned in order of decreasing similarity.
The next example shows how to retrieve an explicit similarity value. If there is neither a WHERE nor an ORDER BY clause, the returned rows are not sorted.
mysql> SELECT id,MATCH (title,body) AGAINST ('Tutorial') FROM articles;
+----+------------- --------------------------+
| id | MATCH (title,body) AGAINST ('Tutorial') |
+ ----+-----------------------------------------+
| 1 | 0.64840710366884 |
| 2 | 0 ---+-----------------------------------------+
6 rows in set (0.00 sec)
The example below is a little more complicated. The query returns similarities and still returns rows in descending order of similarity. To accomplish this result, you should specify MATCH() twice. This causes no additional overhead because the MySQL optimizer will notice the same MATCH() call twice and call the full-text search code only once.
mysql> SELECT id, body, MATCH (title,body) AGAINST
-> ('Security implications of running MySQL as root') AS score
-> FROM articles WHERE MATCH ( title,body) AGAINST
-> ('Security implications of running MySQL as root');
+----+---------------- ------------------+------------------+
| id | body >+----+----------------------------------------+----- ------------+
| 4 | 1. Never run mysqld as root. 2. ... | 1.5055546709332 |
| 6 | When configured properly, MySQL ... | 1.31140957288 |
+----+----------------------------------------+- ----------------+
2 rows in set (0.00 sec)
MySQL uses a very simple parser to separate text into words. A "word" is any sequence of characters composed of text, data, "'" and "_". Any "word" that appears on the stopword list, or is too short (3 characters or less) will be ignored.
Each suitable word in the set and query is measured according to its importance in the set and query. This way, a word that appears in multiple documents will have a lower weight (perhaps even a zero weight) because it has lower semantic value in this particular set. Otherwise, if the word is less, it will get a higher weight. The weights of the words are then combined to calculate the similarity of the record lines.
Such a technique works well with large sets (in fact, it will be carefully harmonized with them). For very small tables, word classification does not adequately reflect their semantic value, and sometimes this model can produce strange results.
mysql> SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('MySQL');
Empty set (0.00 sec)
In the above example, the search term MySQL But got no results because this word appears in more than half of the recorded lines.As such, it is effectively treated as a stopword (i.e., a word with zero semantic value). This is ideal behavior - a natural language query should not return every second row from a 1GB table.
Words that match half of the rows in the table are unlikely to find relevant documents. In fact, it may find many irrelevant documents. We all know that this happens all the time when we are trying to search for something on the internet through a search engine. For this reason, in this particular dataset, such rows are set to a low semantic value.
As of 4.0.1, MySQL can also use the IN BOOLEAN MODE modifier to perform a logical full-text search.
mysql> SELECT * FROM articles WHERE MATCH (title,body)
-> AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+--- --------------------------+---------------------- ---------------+
| id | title ---------------+---------------------------------- ---+
| 1 | MySQL Tutorial | DBMS stands for DataBase... |
| 2 | How To Use MySQL Efficiently | After you went through a... |
| 3 | Optimizing MySQL | In this tutorial we will show ... |
| 4 | 1001 MySQL Tricks | 1. Never run mysqld as root. 2. ... |
| 6 | MySQL Security | When configured properly, MySQL. .. |
+----+----------------------------------+------- ----------------------------------+
This query returns all rows containing the word MySQL (Note: The 50% threshold is not used), but it does not contain the word YourSQL. Note that a logical pattern search does not automatically sort rows in descending order of similar values. As you can see from the above results, the highest similarity value (the one that contains MySQL twice) is listed last, not first. A logical full-text search can work even without a FULLTEXT index, however it is slower.
Logical full-text search supports the following operators:
+
A leading plus sign indicates that the word must appear in each returned record row.
-
A leading minus sign indicates that the word must not appear in each returned record row.
The default (when neither plus nor minus sign is specified) word is optional, but rows containing it will be ranked higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.
< >
These two operators are used to change the base value of the similarity value of a word. The < operator decrements the base value, and the > operator increases it. See the example below.
( )
Parentheses are used to group words in a subexpression.
~
A leading negation sign acts like a negation operator, causing row similarity words to have a negative base value. It is useful for labeling a noisy word. A record containing such a word will be ranked lower, but will not be completely excluded, because the - operator can be used.
*
An asterisk is the truncation operator. Unlike other operators, it should be appended to a word, not before it.
"
phrase, enclosed in double quotes ", matches only lines containing the phrase (literally, as if typed).
Here are some examples:
apple banana
Find rows that contain at least one of the above words
+apple +juice
... Both words are included
+apple macintosh
... Contains the word "apple", but if it also contains "macintosh", it will be ranked higher
+apple -macintosh
... Contains "apple" but does not contain "macintosh"
+apple +(>pie
apple*
... Contains "apple", "apples", "applesauce" and "applet"
"some words "
... can contain "some words of wisdom", but not "some noise words"
6.8.1 Full text restrictions
All parameters of the MATCH() function must come from the same table The columns must also be part of the same FULLTEXT index, unless MATCH() is IN BOOLEAN MODE.
MATCH() column list must exactly match the column list defined in a FULLTEXT index of the table, unless MATCH() is IN BOOLEAN MODE . The parameter of
AGAINST() must be a constant string.
6.8.2 Fine-tuning MySQL full-text search
Unfortunately, full-text search still has very few user-tunable parameters, although adding a few would rank high on the TODO. If you have a MySQL source distribution (see Section 2.3 Installing a MySQL Source Distribution), you can exert more control over full-text search.
Note that full-text search is carefully tuned for the best search results. Modifying the default behavior will, in most cases, only make search results worse. Do not modify MySQL source code unless you know what you are doing!
The minimum length of the indexed word is specified by the MySQL variable ft_min_word_len. See section 4.5.6.4 SHOW VARIABLES. Change it to your desired value and rebuild your FULLTEXT index. (This variable is only supported starting from MySQL 4.0) The
stopword list can be read from the file specified by the ft_stopword_file variable. See section 4.5.6.4 SHOW VARIABLES. After modifying the stopword list, rebuild your FULLTEXT index. (This variable is only supported starting from MySQL 4.0.10)
50% threshold selection is determined by the specific measurement mode selected. To disable it, modify the following line in the `myisam/ftdefs.h' file:
#define GWS_IN_USE GWS_PROB
to:
#define GWS_IN_USE GWS_FREQ
and then recompile MySQL. In this case, there is no need to rebuild the index. NOTE: Using this will severely reduce MySQL's ability to provide sufficient similarity values for MATCH(). If you really need to search for such a public word, it is better to use the IN BOOLEAN MODE search instead, which does not respect the 50% threshold.
Sometimes search engine maintainers wish to change the operators used for logical full-text searches. These are defined by the variable ft_boolean_syntax. See section 4.5.6.4 SHOW VARIABLES. However, this variable is read-only and its value is set in `myisam/ft_static.c'.
For these changes, you are required to rebuild your FULLTEXT index. For a MyISAM table, the easiest way to rebuild the index file is as follows:
mysql> REPAIR TABLE tbl_name QUICK;
6.8.3 Full text search TODO
Make all operations on FULLTEXT index faster
Proximity operator
Support for "always-index words". They can be any string the user wishes to treat as a word, such as "C++", "AS/400", "TCP/IP", etc.
Supports full-text search in MERGE table
Many Byte character support
Create a stopword list according to the language of the data
Stemming (depending on the language of the data, of course)
Generic user-suppliable UDF preparser.
Make the schema more flexible (by providing CREATE/ FULLTEXT in ALTER TABLE adds certain adjustable parameters)

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

The main role of MySQL in web applications is to store and manage data. 1.MySQL efficiently processes user information, product catalogs, transaction records and other data. 2. Through SQL query, developers can extract information from the database to generate dynamic content. 3.MySQL works based on the client-server model to ensure acceptable query speed.

The process of starting MySQL in Docker consists of the following steps: Pull the MySQL image to create and start the container, set the root user password, and map the port verification connection Create the database and the user grants all permissions to the database

Laravel is a PHP framework for easy building of web applications. It provides a range of powerful features including: Installation: Install the Laravel CLI globally with Composer and create applications in the project directory. Routing: Define the relationship between the URL and the handler in routes/web.php. View: Create a view in resources/views to render the application's interface. Database Integration: Provides out-of-the-box integration with databases such as MySQL and uses migration to create and modify tables. Model and Controller: The model represents the database entity and the controller processes HTTP requests.

I encountered a tricky problem when developing a small application: the need to quickly integrate a lightweight database operation library. After trying multiple libraries, I found that they either have too much functionality or are not very compatible. Eventually, I found minii/db, a simplified version based on Yii2 that solved my problem perfectly.

The key to installing MySQL elegantly is to add the official MySQL repository. The specific steps are as follows: Download the MySQL official GPG key to prevent phishing attacks. Add MySQL repository file: rpm -Uvh https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm Update yum repository cache: yum update installation MySQL: yum install mysql-server startup MySQL service: systemctl start mysqld set up booting

Installing MySQL on CentOS involves the following steps: Adding the appropriate MySQL yum source. Execute the yum install mysql-server command to install the MySQL server. Use the mysql_secure_installation command to make security settings, such as setting the root user password. Customize the MySQL configuration file as needed. Tune MySQL parameters and optimize databases for performance.

Article summary: This article provides detailed step-by-step instructions to guide readers on how to easily install the Laravel framework. Laravel is a powerful PHP framework that speeds up the development process of web applications. This tutorial covers the installation process from system requirements to configuring databases and setting up routing. By following these steps, readers can quickly and efficiently lay a solid foundation for their Laravel project.

Compared with other programming languages, MySQL is mainly used to store and manage data, while other languages such as Python, Java, and C are used for logical processing and application development. MySQL is known for its high performance, scalability and cross-platform support, suitable for data management needs, while other languages have advantages in their respective fields such as data analytics, enterprise applications, and system programming.
