Optimizing Large-Scale API Data Retrieval: Best Practices and PHP Lazy Collection Solution

Sep 12, 2024, 04:18 PM

When you use an API to retrieve a large amount of data, potentially thousands of items, several aspects need attention to keep the process efficient, flexible, and performant. Here’s a breakdown of the key factors to manage, along with a solution for PHP users.

Key considerations when retrieving large data via API

These are the points that matter most when pulling a large dataset through an API:

  • Handling pagination: APIs typically deliver data in pages. To retrieve all the data, you need to manage pagination, performing multiple API calls while keeping track of the cursor or page number. Calculating the number of required API calls and managing this process is essential to ensure you get the complete dataset.
  • Memory management: when fetching large datasets, loading every result into memory at once can overwhelm your system. Instead, process data in chunks, so your application remains responsive and doesn’t run into memory issues.
  • Rate limiting & throttling: many APIs impose rate limits, such as restricting you to X requests per second or Y requests per minute. To stay within these limits, you must implement a flexible throttling mechanism that adapts to the API's specific restrictions.
  • Parallel API requests: given the need to perform numerous API calls due to pagination, you want to retrieve data as quickly as possible. One strategy is to make multiple API calls in parallel, all while respecting the rate limits. This ensures that your requests are both fast and compliant with API constraints.
  • Efficient data collection: despite making numerous paginated API requests, you need to combine the results into a single collection, handling them efficiently to avoid memory overload. This ensures smooth processing of data while keeping resource usage low.
  • Optimized JSON parsing: many APIs return data in JSON format. When dealing with large responses, it's important to access and query specific sections of the JSON in a performant manner, ensuring that unnecessary data isn't loaded or processed.
  • Efficient exception handling: APIs typically signal errors through HTTP status codes, indicating issues like timeouts, unauthorized access, or server errors. It’s important to handle these using the exception mechanism provided by your programming language. Beyond basic error handling, you should also map each error to an exception that aligns with your application's logic, so the error handling stays clear and manageable. Retries, logging, and meaningful exception types together make data retrieval more reliable.
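
The pagination and memory-management points above can be sketched with a plain PHP generator. This is a minimal illustration, not taken from any library: `paginate()` and the `$fakeApi` stand-in are hypothetical names, and the response shape (`total`, `items`) is an assumption chosen so the example runs without network access.

```php
<?php

/**
 * Lazily yields every item of a paginated resource, one page at a time.
 *
 * $fetchPage is a hypothetical callable: it receives a 1-based page number and
 * returns ['total' => int, 'items' => array]. In a real script it would wrap
 * an HTTP request.
 */
function paginate(callable $fetchPage, int $perPage): Generator
{
    $first = $fetchPage(1);

    // Number of API calls needed to cover the whole dataset.
    $pages = (int) ceil($first['total'] / $perPage);

    yield from $first['items'];

    for ($page = 2; $page <= $pages; $page++) {
        // Only the page being iterated is held in memory.
        yield from $fetchPage($page)['items'];
    }
}

// Stand-in for a remote API: 25 fake records served 10 per page.
$records = range(1, 25);
$calls = 0;
$fakeApi = function (int $page) use ($records, &$calls): array {
    $calls++;

    return [
        'total' => count($records),
        'items' => array_slice($records, ($page - 1) * 10, 10),
    ];
};

$seen = [];
foreach (paginate($fakeApi, 10) as $item) {
    $seen[] = $item; // a real consumer would process the item and discard it
}
```

Because `yield from` keeps each page's array keys, consume the generator with `foreach` as shown rather than `iterator_to_array()` with its default key preservation, which would overwrite items that share a key across pages.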

The "Lazy JSON Pages" PHP Solution

If you're working with PHP, you're in luck. The Lazy JSON Pages open source package offers a convenient, framework-agnostic API scraper that can load items from paginated JSON APIs into a Laravel lazy collection via asynchronous HTTP requests. This package simplifies pagination, throttling, parallel requests, and memory management, ensuring efficiency and performance.

You can find more information about the package, including further customization options, in the README of the official GitHub repository: Lazy JSON Pages (https://github.com/cerbero90/lazy-json-pages).

Thank you to Andrea Marco Sartori, the author of the package.

Example: Retrieving Thousands of Stories from Storyblok

Here’s a concise example of retrieving thousands of stories from Storyblok using the Lazy JSON Pages package in PHP.
First, create a new directory, move into it, and install the package:

mkdir lazy-http
cd lazy-http
composer require cerbero/lazy-json-pages

Once the package is installed, you can start creating your script:

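<?php

require "./vendor/autoload.php";

use Illuminate\Support\LazyCollection;

$token = "your-storyblok-access-token";
$version = "draft"; // draft or published

// Storyblok Content Delivery API endpoint that lists stories.
$source = "https://api.storyblok.com/v2/cdn/stories?token=" . $token . "&version=" . $version;

$lazyCollection = LazyCollection::fromJsonPages($source)
    ->totalItems('total')                   // name of the value holding the total number of items
    ->async(requests: 3)                    // send up to 3 HTTP requests in parallel
    ->throttle(requests: 10, perSeconds: 1) // send at most 10 requests per second
    ->collect('stories.*');                 // yield each entry found under "stories"

// Items are yielded one at a time while the pages are being fetched.
foreach ($lazyCollection as $item) {
    echo $item["name"] . PHP_EOL;
}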

Then replace the placeholder with your own access token, and run the script with the php command.

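How it works

  • Efficient pagination: the API results are paginated, and the lazy collection fetches all pages without storing everything in memory.
  • Async API calls: the ->async(requests: 3) line triggers three API requests in parallel, improving performance.
  • Throttling: the ->throttle(requests: 10, perSeconds: 1) line ensures that no more than 10 requests are sent per second, respecting the rate limits.
  • Memory efficiency: a lazy collection lets you process the data item by item, reducing memory overhead even for large datasets.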
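
To make the throttling idea concrete, here is a minimal sliding-window rate limiter in plain PHP. It is an illustrative sketch only and not the package's implementation: the `Throttle` class and its method names are hypothetical, and the loop uses a simulated clock so that it runs instantly. A real script would read the time with `microtime(true)` and pause with `usleep()`.

```php
<?php

/**
 * Minimal sliding-window rate limiter (illustrative, hypothetical class).
 */
final class Throttle
{
    /** @var float[] timestamps of the requests sent inside the current window */
    private array $sent = [];

    public function __construct(
        private int $maxRequests,
        private float $windowSeconds,
    ) {
    }

    /** Returns how many seconds to wait before a request may be sent at $now. */
    public function delayFor(float $now): float
    {
        // Forget requests that have already left the window.
        $this->sent = array_values(array_filter(
            $this->sent,
            fn (float $t): bool => $now - $t < $this->windowSeconds
        ));

        if (count($this->sent) < $this->maxRequests) {
            return 0.0;
        }

        // Wait until the oldest request in the window expires.
        return $this->sent[0] + $this->windowSeconds - $now;
    }

    /** Records that a request was sent at $time. */
    public function record(float $time): void
    {
        $this->sent[] = $time;
    }
}

// Simulate 25 requests with a limit of 10 requests per second.
$throttle = new Throttle(maxRequests: 10, windowSeconds: 1.0);
$clock = 0.0; // simulated time in seconds
$sendTimes = [];

for ($i = 0; $i < 25; $i++) {
    $clock += $throttle->delayFor($clock); // "sleep" by advancing the clock
    $throttle->record($clock);
    $sendTimes[] = $clock;
}
```

In this simulation the first ten requests go out at second 0, the next ten at second 1, and the last five at second 2, so no one-second window ever contains more than ten requests.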

This approach offers a reliable, performant, and memory-efficient way to retrieve large amounts of data from APIs in PHP.

References

  • Lazy JSON Pages package: https://github.com/cerbero90/lazy-json-pages
  • Author of the open source package: https://github.com/cerbero90
