Use the Crawler component to analyze HTML in Laravel

Aug 07, 2017, 05:10 PM
html laravel

This article introduces how to use Symfony's Crawler component to analyze HTML in Laravel. Readers who need it can use it as a reference.

The full name of Crawler is DomCrawler, and it is a component of the Symfony framework. Frustratingly, DomCrawler has no Chinese documentation, and this part of the Symfony manual has not been translated, so developing with DomCrawler means exploring it bit by bit. Here I summarize what I have learned from using it.

The first step is installation:


composer require symfony/dom-crawler
composer require symfony/css-selector

symfony/css-selector is the CSS selector component. It is needed by methods such as filter() that select nodes with CSS selectors.

The example used in the manual is


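use Symfony\Component\DomCrawler\Crawler;

$html = <<<'HTML'
<!DOCTYPE html>
<html>
    <body>
        <p class="message">Hello World!</p>
        <p>Hello Crawler!</p>
    </body>
</html>
HTML;

$crawler = new Crawler($html);
// Iterating a Crawler yields its DOMElement nodes; here that is only the root node.
foreach ($crawler as $domElement) {
    var_dump($domElement->nodeName);
}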

The printed result is


string 'html' (length=4)

This is because the root node of this HTML snippet is the html element, so its nodeName is html. My English is not good, and when I first used it I thought the program was wrong.

In actual use, new Crawler($html) can produce garbled characters, which is probably related to the page encoding. In that case, initialize an empty crawler first and then add the content to it:


$crawler = new Crawler();
$crawler->addHtmlContent($html);

The second parameter of addHtmlContent is charset, and the default is utf-8.
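As a rough sketch of the charset parameter, the snippet below assumes a page encoded in GBK. That encoding is a hypothetical case chosen for illustration; the input is simulated with mb_convert_encoding, which needs the mbstring extension. The require line assumes a standalone Composer project and is not needed inside Laravel.

```php
<?php
// Assumes a Composer project; inside Laravel the autoloader is already loaded.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical input: a GBK-encoded page, simulated here from a UTF-8 literal.
$utf8Html = '<html><body><p>你好</p></body></html>';
$gbkHtml  = mb_convert_encoding($utf8Html, 'GBK', 'UTF-8');

$crawler = new Crawler();
// Name the source encoding explicitly instead of relying on the UTF-8 default.
$crawler->addHtmlContent($gbkHtml, 'GBK');

// The DOM hands text back as UTF-8 regardless of the source encoding.
$text = $crawler->filterXPath('//body/p')->text();
echo $text, PHP_EOL;
```

If the encoding is not known in advance, it has to be detected first (for example from the response headers); that step is outside this sketch.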

For other examples, please refer to the official documentation, http://symfony.com/doc/current/components/dom_crawler.html

Below are the usages I have tried out bit by bit at work.

The filterXPath(string $xpath) method: according to the manual, its parameter is an XPath expression. It is most often used to select block elements such as p.


echo $crawler->filterXPath('//body/p')->text();
echo $crawler->filterXPath('//body/p')->last()->text();

These print the text of the first p element and of the last p element, respectively.
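To make the positional helpers concrete, here is a minimal sketch that reuses the manual's two-paragraph HTML. It assumes a standalone Composer project for the require line; the h1 lookup is only an illustration of an empty selection.

```php
<?php
// Assumes a Composer project; inside Laravel the autoloader is already loaded.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = <<<'HTML'
<!DOCTYPE html>
<html>
    <body>
        <p class="message">Hello World!</p>
        <p>Hello Crawler!</p>
    </body>
</html>
HTML;

$crawler    = new Crawler($html);
$paragraphs = $crawler->filterXPath('//body/p');

$count  = $paragraphs->count();         // number of matched p elements
$first  = $paragraphs->first()->text(); // the same node that text() alone reads
$second = $paragraphs->eq(1)->text();   // eq() takes a zero-based position
$last   = $paragraphs->last()->text();

// text() throws on an empty selection, so check count() before reading.
$headings = $crawler->filterXPath('//body/h1');
$title    = $headings->count() > 0 ? $headings->text() : '(no h1 found)';

echo "$count: $first / $second / $last / $title", PHP_EOL;
```

Checking count() first avoids the InvalidArgumentException that text() raises when nothing matched.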


var_dump($crawler->filterXPath('//body')->html());

This outputs the HTML inside the body element.


foreach ($crawler->filterXPath('//body/p') as $i => $node) {
    // $node is a DOMElement; wrap it in a Crawler to keep using the crawler methods.
    $c = new Crawler($node);
    echo $c->filter('p')->text();
}

filterXPath returns a Crawler that wraps a list of DOMElement nodes. When you iterate over it with foreach, each item is a DOMElement, which can be wrapped in a new Crawler object to continue parsing.
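A small sketch of what each loop item looks like, assuming the same two-paragraph HTML as in the manual's example (condensed onto one line here). It reads the text once through the native DOM property and once through a wrapping Crawler.

```php
<?php
// Assumes a Composer project; inside Laravel the autoloader is already loaded.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = '<html><body><p class="message">Hello World!</p><p>Hello Crawler!</p></body></html>';
$crawler = new Crawler($html);

$rows = [];
$allDomElements = true;
foreach ($crawler->filterXPath('//body/p') as $node) {
    // Each item is a native DOMElement, so plain DOM properties work directly.
    $allDomElements = $allDomElements && $node instanceof \DOMElement;
    $viaDom = $node->textContent;

    // Wrapping the node gives the Crawler helpers (text(), attr(), filter()) back.
    $viaCrawler = (new Crawler($node))->text();

    $rows[] = [$node->nodeName, $viaDom, $viaCrawler];
}

print_r($rows);
```

For simple reads the native properties are enough; the wrapper is worth it when you need further filtering inside the node.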


$nodeValues = $crawler->filterXPath('//body/p')->each(function (Crawler $node, $i) {
    return $node->text();
});

Crawler provides an each() method that loops over the nodes with a closure, which simplifies the code. Note, however, that written this way $nodeValues is an array, which needs further processing.
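One possible way to handle the array that each() returns is sketched below, again assuming the manual's two-paragraph HTML. The separator and the upper-casing step are arbitrary choices for illustration only.

```php
<?php
// Assumes a Composer project; inside Laravel the autoloader is already loaded.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = '<html><body><p class="message">Hello World!</p><p>Hello Crawler!</p></body></html>';
$crawler = new Crawler($html);

// each() calls the closure once per node and collects the return values.
$nodeValues = $crawler->filterXPath('//body/p')->each(function (Crawler $node, $i) {
    return $node->text();
});

// The result is a plain PHP array in document order, so array functions apply.
$joined = implode(' | ', $nodeValues);
$upper  = array_map('strtoupper', $nodeValues);

echo $joined, PHP_EOL;
print_r($upper);
```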

Other usage


echo $crawler->filterXPath('//body/p')->attr('class');

This returns "message", the value of the class attribute of the first p element.


// "style" stands for the class name you want to match.
$crawler->filterXPath('//p[@class="style"]')->filter('a')->attr('href');
$crawler->filterXPath('//p[@class="style"]')->filter('a>img')->extract(array('alt', 'src'));
The above are some methods of obtaining tag attributes
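The attribute methods can be tried on a small page like the one sketched here. The HTML, the "gallery" class name, and the paths in it are all made up for illustration; filter() additionally assumes that symfony/css-selector is installed.

```php
<?php
// Assumes a Composer project; inside Laravel the autoloader is already loaded.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup: a paragraph holding two linked images.
$html = <<<'HTML'
<html>
    <body>
        <p class="gallery">
            <a href="/first"><img src="/img/1.png" alt="First"></a>
            <a href="/second"><img src="/img/2.png" alt="Second"></a>
        </p>
    </body>
</html>
HTML;

$block = (new Crawler($html))->filterXPath('//p[@class="gallery"]');

// attr() reads the attribute from the first matched node only.
$firstHref = $block->filter('a')->attr('href');

// extract() reads from every matched node; one attribute gives a flat list.
$hrefs = $block->filter('a')->extract(['href']);

// Several attributes give one inner array per node, in the order requested.
$images = $block->filter('a > img')->extract(['alt', 'src']);

echo $firstHref, PHP_EOL;
print_r($hrefs);
print_r($images);
```

attr() suits a single known element, while extract() is the shorter route when every match is needed.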

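filter and filterXPath are different. According to the manual, filter takes a CSS selector, whereas filterXPath takes an XPath expression. I do not fully understand the difference; my understanding is that filter is handy for picking out elements contained in nodes already selected with XPath, such as p. It is worth trying both in actual development.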
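One way to see the difference is to run a CSS selector and two XPath expressions against the same markup, as in this sketch. The second class name, "highlight", is invented for the example, and filter() assumes symfony/css-selector is installed.

```php
<?php
// Assumes a Composer project; inside Laravel the autoloader is already loaded.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup: the second paragraph carries two class names.
$html = <<<'HTML'
<html>
    <body>
        <p class="message">Hello World!</p>
        <p class="message highlight">Hello Crawler!</p>
    </body>
</html>
HTML;

$crawler = new Crawler($html);

// CSS: ".message" matches any element whose class list contains "message".
$byCss = $crawler->filter('p.message')->count();

// XPath: @class="message" compares the whole attribute value, so only an exact match counts.
$byExactXPath = $crawler->filterXPath('//p[@class="message"]')->count();

// XPath written to test for one class name inside the class list, like the CSS form.
$tokenXPath   = '//p[contains(concat(" ", normalize-space(@class), " "), " message ")]';
$byTokenXPath = $crawler->filterXPath($tokenXPath)->count();

echo "$byCss $byExactXPath $byTokenXPath", PHP_EOL;
```

For class-based selection the CSS form is usually shorter, while XPath gives finer control over exact attribute values and document structure.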

Overall, I find DomCrawler easier to use than Simple HTML DOM, though perhaps that is just because I am more used to it.

The above covers only the basic functions of Crawler. For more usage, refer to the Crawler section of the Symfony API manual:

http://api.symfony.com/3.2/Symfony/Component/DomCrawler/Crawler.html

The main problem with Crawler is that there are too few examples. The API manual contains no usage examples, so you can only explore it through actual use.

Symfony's documentation for DomCrawler, which contains a few examples:

http://symfony.com/doc/current/components/dom_crawler.html
