Build 'For you' recommendations using AI on Fastly!
Forget the hype; where is AI delivering real value? Let's use edge computing to harness the power of AI and make smarter user experiences that are also fast, safe and reliable.
Recommendations are everywhere, and everyone knows that making web experiences more personalized makes them more engaging and successful. My Amazon homepage knows that I like home furnishings, kitchenware and, right now, summer clothing.
Today, most platforms make you choose between being fast and being personalized. At Fastly, we think you — and your users — deserve to have both. If every page your web server generates is suitable for only one end user, you can't benefit from caching it, which is what edge networks like Fastly do well.
So how can you benefit from edge caching, and yet make content personalized? We've written a lot before about how to break up complex client requests into multiple smaller, cacheable backend requests, and you'll find tutorials, code examples and demos in the personalization topic on our developer hub.
But what if you want to go further and generate the personalization data at the edge? The "edge" (the Fastly servers handling your website's traffic) is the closest point to the end user that's still within your control, and a great place to produce content that's specific to one user.
The "For you" use case
Product recommendations are inherently transient, specific to an individual user, and likely to change frequently. But they also don't need to persist: we don't typically need to know what we've recommended to each person, only whether a particular algorithm achieves better conversion than another. Some recommendation algorithms need access to a large amount of state data, such as which users are most similar to you and their purchase or rating history, but often that data is easy to pregenerate in bulk.
Basically, generating recommendations usually doesn't create a transaction, doesn't need any locks in your data store, and makes use of input data that's either immediately available from the current user's session, or created in an offline build process.
Sounds like we can generate recommendations at the edge!
A real-world example
Let's take a look at the website of the New York Metropolitan Museum of Art. Each of the 500,000 or so objects in the Met's collection has a page with a picture and information about it, along with a list of related objects.
This seems to use a fairly straightforward system of faceting to generate the relationships, showing me other artworks by the same artist, other objects in the same wing of the museum, or objects that are also made of paper or originate in the same time period.
The nice thing about this system (from a developer perspective!) is that since it's only based on the one input object, it can be pre-generated into the page.
What if we want to augment this with a selection of recommendations that are based on the end user's personal browsing history as they navigate around the Met's website, not just based on this one object?
Adding personalized recommendations
There are lots of ways we could do this, but I wanted to try using a language model, since AI is happening right now, and it's really different from the way the Met's existing related-artworks mechanism seems to work. Here's the plan:
- Download the Met's open access collection dataset.
- Run it through a language model to create vector embeddings – lists of numbers suitable for machine learning tasks.
- Build a performant similarity search engine for the resulting half a million vectors (representing the Met's artworks) and load it into a KV store so we can use it from Fastly Compute.
Once we've done all that, we should be able to, as you browse the Met's website:
- Track the artworks you visit in a cookie (see the sketch after this list).
- Look up the vectors corresponding to those artworks.
- Calculate an average vector representing your browsing interests.
- Plug that into our similarity search engine to find the most similar artworks.
- Load details about those artworks from the Met's Object API and augment the page with personalized recommendations.
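Here's a minimal sketch of the first few of those steps in Rust, assuming a hypothetical `history` cookie and an in-memory id-to-vector map over the 5-dimensional vectors we'll generate below. The names here are illustrative, not the production design:

```rust
use std::collections::HashMap;

/// Parse a hypothetical "history" cookie (e.g. "history=123,456,789")
/// into the list of artwork IDs the user has visited.
fn visited_ids(cookie_header: &str) -> Vec<u32> {
    cookie_header
        .split(';')
        .filter_map(|part| part.trim().strip_prefix("history="))
        .flat_map(|ids| ids.split(','))
        .filter_map(|id| id.trim().parse().ok())
        .collect()
}

/// Average the visited artworks' embedding vectors into a single
/// "interest vector" describing the user's browsing history.
fn interest_vector(ids: &[u32], vectors: &HashMap<u32, [f32; 5]>) -> Option<[f32; 5]> {
    let mut sum = [0f32; 5];
    let mut count = 0;
    for id in ids {
        if let Some(v) = vectors.get(id) {
            for (s, x) in sum.iter_mut().zip(v) {
                *s += x;
            }
            count += 1;
        }
    }
    if count == 0 {
        return None; // no usable history yet
    }
    for s in &mut sum {
        *s /= count as f32;
    }
    Some(sum)
}
```

With that average interest vector in hand, everything else is a similarity search, which we'll build below.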
Et voilà: personalized recommendations.
OK, so let's break that down.
Creating the dataset
The Met's raw dataset is a CSV with lots of columns and looks like this:
Object Number,Is Highlight,Is Timeline Work,Is Public Domain,Object ID,Gallery Number,Department,AccessionYear,Object Name,Title,Culture,Period,Dynasty,Reign,Portfolio,Constituent ID,Artist Role,Artist Prefix,Artist Display Name,Artist Display Bio,Artist Suffix,Artist Alpha Sort,Artist Nationality,Artist Begin Date,Artist End Date,Artist Gender,Artist ULAN URL,Artist Wikidata URL,Object Date,Object Begin Date,Object End Date,Medium,Dimensions,Credit Line,Geography Type,City,State,County,Country,Region,Subregion,Locale,Locus,Excavation,River,Classification,Rights and Reproduction,Link Resource,Object Wikidata URL,Metadata Date,Repository,Tags,Tags AAT URL,Tags Wikidata URL
1979.486.1,False,False,False,1,,The American Wing,1979,Coin,One-dollar Liberty Head Coin,,,,,,16429,Maker," ",James Barton Longacre,"American, Delaware County, Pennsylvania 1794–1869 Philadelphia, Pennsylvania"," ","Longacre, James Barton",American,1794 ,1869 ,,http://vocab.getty.edu/page/ulan/500011409,https://www.wikidata.org/wiki/Q3806459,1853,1853,1853,Gold,Dimensions unavailable,"Gift of Heinz L. Stoppelmann, 1979",,,,,,,,,,,,,,http://www.metmuseum.org/art/collection/search/1,,,"Metropolitan Museum of Art, New York, NY",,,
1980.264.5,False,False,False,2,,The American Wing,1980,Coin,Ten-dollar Liberty Head Coin,,,,,,107,Maker," ",Christian Gobrecht,1785–1844," ","Gobrecht, Christian",American,1785 ,1844 ,,http://vocab.getty.edu/page/ulan/500077295,https://www.wikidata.org/wiki/Q5109648,1901,1901,1901,Gold,Dimensions unavailable,"Gift of Heinz L. Stoppelmann, 1980",,,,,,,,,,,,,,http://www.metmuseum.org/art/collection/search/2,,,"Metropolitan Museum of Art, New York, NY",,,
Simple enough to transform that into two columns, an ID and a string:
id,description
1,"One-dollar Liberty Head Coin; Type: Coin; Artist: James Barton Longacre; Medium: Gold; Date: 1853; Credit: Gift of Heinz L. Stoppelmann, 1979"
2,"Ten-dollar Liberty Head Coin; Type: Coin; Artist: Christian Gobrecht; Medium: Gold; Date: 1901; Credit: Gift of Heinz L. Stoppelmann, 1980"
3,"Two-and-a-Half Dollar Coin; Type: Coin; Medium: Gold; Date: 1927; Credit: Gift of C. Ruxton Love Jr., 1967"
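If you want to script that step in Rust (the language we'll use for the rest of the pipeline), a sketch using the csv crate might look like this. The column subset and the label formatting are illustrative choices, not the exact transformation we ran:

```rust
use std::error::Error;

// Build "id,description" rows from the Met's open access CSV by
// concatenating a few descriptive columns into one string per object.
fn main() -> Result<(), Box<dyn Error>> {
    let mut rdr = csv::Reader::from_path("MetObjects.csv")?;
    let mut wtr = csv::Writer::from_path("descriptions.csv")?;
    wtr.write_record(["id", "description"])?;

    let headers = rdr.headers()?.clone();
    let col = |name: &str| headers.iter().position(|h| h == name);
    let (id, title, object_name, artist, medium, date, credit) = (
        col("Object ID"), col("Title"), col("Object Name"),
        col("Artist Display Name"), col("Medium"), col("Object Date"), col("Credit Line"),
    );

    for record in rdr.records() {
        let record = record?;
        let field = |i: Option<usize>| i.and_then(|i| record.get(i)).unwrap_or("").trim();
        let mut description = field(title).to_string();
        for (label, value) in [
            ("Type", field(object_name)), ("Artist", field(artist)),
            ("Medium", field(medium)), ("Date", field(date)), ("Credit", field(credit)),
        ] {
            if !value.is_empty() {
                description.push_str(&format!("; {}: {}", label, value));
            }
        }
        wtr.write_record([field(id), description.as_str()])?;
    }
    wtr.flush()?;
    Ok(())
}
```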
Now we can use the transformers package from Hugging Face's AI toolset to generate an embedding of each of these descriptions. We used the sentence-transformers/all-MiniLM-L12-v2 model, and then applied principal component analysis (PCA) to reduce the resulting vectors to 5 dimensions. That gives you something like:
[ { "id": 1, "vector": [ -0.005544120445847511, -0.030924081802368164, 0.008597176522016525, 0.20186401903629303, 0.0578165128827095 ] }, { "id": 2, "vector": [ -0.005544120445847511, -0.030924081802368164, 0.008597176522016525, 0.20186401903629303, 0.0578165128827095 ] }, … ]
We have half a million of these, so it's not possible to store this entire dataset within the edge app's memory. And we want to do a custom type of similarity search over this data, which is something a traditional key-value store doesn't offer. Since we’re building a real-time experience, we also really want to avoid having to search half a million vectors at a time.
So, let's partition the data. We can use k-means clustering to group vectors that are similar to each other. We sliced the data into 500 clusters of varying sizes, and calculated a center point, called a "centroid vector", for each of those clusters. (If you plotted this vector space in two dimensions and zoomed in, you'd see each cluster as a little cloud of points with its centroid at the center.)
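For intuition, here's what one iteration of k-means looks like in Rust. This is a hand-rolled illustration of the assignment and update steps, not the actual clustering run, which you'd do offline with any off-the-shelf k-means implementation at k = 500:

```rust
// Squared Euclidean distance between two 5-dimensional vectors.
fn squared_distance(a: &[f32; 5], b: &[f32; 5]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

// Index of the centroid nearest to v (assumes centroids is non-empty).
fn nearest_centroid(v: &[f32; 5], centroids: &[[f32; 5]]) -> usize {
    centroids
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| squared_distance(v, a).total_cmp(&squared_distance(v, b)))
        .map(|(i, _)| i)
        .unwrap()
}

// One Lloyd's-algorithm step: assign every vector to its nearest
// centroid, then move each centroid to the mean of its members.
fn kmeans_step(vectors: &[[f32; 5]], centroids: &mut [[f32; 5]]) -> Vec<usize> {
    let assignment: Vec<usize> =
        vectors.iter().map(|v| nearest_centroid(v, centroids)).collect();

    let mut sums = vec![[0f32; 5]; centroids.len()];
    let mut counts = vec![0usize; centroids.len()];
    for (v, &c) in vectors.iter().zip(&assignment) {
        for (s, x) in sums[c].iter_mut().zip(v) {
            *s += x;
        }
        counts[c] += 1;
    }
    for ((centroid, sum), &count) in centroids.iter_mut().zip(&sums).zip(&counts) {
        if count > 0 {
            for (c, s) in centroid.iter_mut().zip(sum) {
                *c = s / count as f32;
            }
        }
    }
    assignment
}
```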
The centroids work like wayfinders for our half-million-vector space. For instance, if we want to find the 10 most similar vectors to a given vector A, we can first look for the nearest centroid (out of 500), then conduct our search only within its corresponding cluster, a much more manageable area!
Now we have 500 small datasets and an index that maps centroid points to the relevant dataset. Next, to enable real-time performance, we want to precompile the search graphs so that we don't need to initialize and construct them at runtime, and can use as little CPU time as possible. A really fast nearest-neighbor algorithm is Hierarchical Navigable Small Worlds (HNSW), and it has a pure Rust implementation, a good match since Rust is also what we're using to write our edge app. So we wrote a small standalone Rust app to construct the HNSW graph structs for each dataset, and then used bincode to export the memory of each instantiated struct into a binary blob.
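That export step might look roughly like the sketch below. `ClusterGraph` is a stand-in for the instantiated HNSW struct (the real type comes from the HNSW crate); the point is that a serde-serializable graph can be snapshotted with bincode:

```rust
use serde::{Deserialize, Serialize};
use std::fs;

// Stand-in for the instantiated HNSW graph over one cluster's vectors.
// What matters is that it derives Serialize/Deserialize so bincode can
// snapshot the fully constructed struct.
#[derive(Serialize, Deserialize)]
struct ClusterGraph {
    ids: Vec<u32>,          // artwork IDs in this cluster
    vectors: Vec<[f32; 5]>, // their embedding vectors
    // ...plus the HNSW layers/links built at construction time
}

fn export_cluster(cluster_id: usize, graph: &ClusterGraph) -> Result<(), Box<dyn std::error::Error>> {
    // Serialize the prebuilt graph into a binary blob...
    let blob: Vec<u8> = bincode::serialize(graph)?;
    // ...and write it out, ready to upload to the KV store under a
    // key like "cluster_042".
    fs::write(format!("cluster_{:03}.bin", cluster_id), blob)?;
    Ok(())
}
```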
Now, those binary blobs can be uploaded to the KV store, keyed by cluster number, and the index of centroids can be bundled into our edge app.
This architecture lets us load parts of the search index into memory on demand. And since we’ll never have to search more than a few thousand vectors at a time, our searches will always be cheap and fast.
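Putting the runtime pieces together, here's a hedged sketch of the lookup path, reusing `nearest_centroid` from the k-means sketch and `ClusterGraph` from the export sketch above. The store name, key format, and the `nearest` search method are illustrative assumptions, and the KVStore calls paraphrase the Fastly Rust SDK, so check the current API before relying on this:

```rust
use fastly::kv_store::KVStore;

/// Given the user's averaged interest vector, return the IDs of the
/// most similar artworks by searching a single precompiled cluster.
fn recommend(
    user_vector: &[f32; 5],
    centroids: &[[f32; 5]], // bundled into the app at build time
    count: usize,
) -> Result<Vec<u32>, fastly::Error> {
    // 1. Route: find the nearest of the 500 centroids.
    let cluster = nearest_centroid(user_vector, centroids);

    // 2. Fetch that cluster's precompiled HNSW graph from the KV store.
    let mut store = KVStore::open("met_clusters")?.expect("KV store is linked");
    let blob = store
        .lookup(&format!("cluster_{:03}", cluster))?
        .expect("cluster blob exists")
        .into_bytes();

    // 3. Deserialize the prebuilt graph and search only this cluster:
    //    a few thousand vectors at most, instead of half a million.
    let graph: ClusterGraph = bincode::deserialize(&blob)?;
    Ok(graph.nearest(user_vector, count)) // hypothetical search method
}
```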
Building the edge app
The application that we run at the edge needs to handle several types of requests:
- HTML pages: We fetch these from metmuseum.org and transform the response to add the extra front-end code that displays the recommendations.
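For orientation, the outer shell of such an app might look like the minimal sketch below. The `met_origin` backend name, the cookie handling, and `inject_recommendations_ui` are illustrative assumptions layered on the standard #[fastly::main] pattern from the Compute Rust starter kit:

```rust
use fastly::{Error, Request, Response};

#[fastly::main]
fn main(req: Request) -> Result<Response, Error> {
    // Read the visited-artwork history from the (hypothetical) cookie,
    // using the visited_ids helper sketched earlier.
    let history = visited_ids(req.get_header_str("cookie").unwrap_or(""));

    // Pass the request through to the Met's origin.
    let mut resp = req.send("met_origin")?; // backend name is illustrative

    // On HTML responses, inject the front-end code that will fetch and
    // render the personalized recommendations.
    let is_html = resp
        .get_header_str("content-type")
        .map(|ct| ct.starts_with("text/html"))
        .unwrap_or(false);
    if is_html {
        let page = resp.take_body_str();
        resp.set_body(inject_recommendations_ui(&page, &history)); // hypothetical helper
    }
    Ok(resp)
}
```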