How to create a model from my data on Kaggle
This tutorial demonstrates how to use the FastAI library to train an image classification model to distinguish between cats and dogs. We'll go step by step, from data preparation to model training and usage.
Step 1: Data preparation
- Image search function: First, we define a function that searches for images with the DuckDuckGo search engine. It takes keywords and a maximum number of images as input and returns a list of image URLs. When the notebook runs on Kaggle, the required packages are installed first.
import os
iskaggle = os.environ.get('KAGGLE_KERNEL_RUN_TYPE', '')
if iskaggle:
    !pip install -Uqq fastai 'duckduckgo_search>=6.2'

from duckduckgo_search import DDGS
from fastcore.all import *
import time, json

def search_images(keywords, max_images=200):
    return L(DDGS().images(keywords, max_results=max_images)).itemgot('image')
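To make the last line of the search function less opaque, here is a minimal plain-Python sketch of what the `itemgot('image')` step does. It assumes each search result is a dict with an `image` key holding the URL; the sample results, the example.com URLs, and the helper name `extract_image_urls` are invented for illustration and are not part of any library.

```python
# Made-up sample shaped like image search results: one dict per hit.
results = [
    {'title': 'A dog', 'image': 'https://example.com/dog1.jpg'},
    {'title': 'Another dog', 'image': 'https://example.com/dog2.jpg'},
]

def extract_image_urls(results):
    # Hypothetical helper: plain-Python counterpart of
    # L(results).itemgot('image'), pulling the 'image' value from each dict.
    return [r['image'] for r in results]

urls = extract_image_urls(results)
print(urls)
```

The fastcore `L(...).itemgot(...)` form does the same extraction in one chained call and returns an `L` list rather than a built-in list.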
- Search and download sample images: Let’s search for “dog photos” and “cat photos” respectively and download a sample image.
urls = search_images('dog photos', max_images=1)

from fastdownload import download_url
dest = 'dog.jpg'
download_url(urls[0], dest, show_progress=False)

from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
Similarly, we download a picture of a cat:
download_url(search_images('cat photos', max_images=1)[0], 'cat.jpg', show_progress=False)
Image.open('cat.jpg').to_thumb(256,256)
- Batch download and pre-process images: We download multiple pictures of dogs and cats and save them into the dog_or_not/dog and dog_or_not/cat folders respectively. At the same time, we resize the images to at most 400 pixels so that they load faster during training.
searches = 'dog', 'cat'
path = Path('dog_or_not')

for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    time.sleep(5)  # pause between searches to avoid overloading the server
    resize_images(path/o, max_size=400, dest=path/o)
- Clean invalid images: Delete images that failed to download or are damaged.
failed = verify_images(get_image_files(path))
failed.map(Path.unlink)
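After cleaning, it can be worth checking how many images remain in each class folder, since a very small or lopsided dataset limits what the model can learn. The following is a minimal standard-library sketch and not part of the original tutorial; the helper name `count_images_per_class` and the extension list are assumptions chosen for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.bmp', '.webp'}

def count_images_per_class(root):
    # Hypothetical helper (not a fastai function): counts image files in
    # each class subfolder of `root`, e.g. dog_or_not/dog and dog_or_not/cat.
    root = Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1 for f in class_dir.rglob('*')
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts

if __name__ == '__main__':
    data_dir = Path('dog_or_not')
    if data_dir.exists():
        print(count_images_per_class(data_dir))
```

If one class ends up with far fewer images than the other, rerunning the search with a different keyword for that class is a simple remedy.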
Step 2: Model training
- Create DataLoaders: Use DataBlock to create the DataLoaders that load and preprocess the image data.
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path, bs=32)

dls.show_batch(max_n=6)
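Two of the DataBlock arguments carry most of the logic: `get_y=parent_label` takes each image's label from the name of its folder, and `RandomSplitter(valid_pct=0.2, seed=42)` holds out a reproducible random 20% of the files for validation. The sketch below illustrates both ideas with the standard library only. The function names `label_from_parent` and `random_split` are hypothetical, and the split will not match the one fastai produces, since fastai uses its own random number generation.

```python
import random
from pathlib import Path

def label_from_parent(file_path):
    # Same idea as parent_label: the class is the name of the folder
    # that directly contains the file.
    return Path(file_path).parent.name

def random_split(items, valid_pct=0.2, seed=42):
    # Shuffle the indices reproducibly, then hold out the first
    # valid_pct of them for validation. Illustrative only.
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in indices[:cut]]
    train = [items[i] for i in indices[cut:]]
    return train, valid

# Ten placeholder file paths, five per class.
files = [f'dog_or_not/dog/{i}.jpg' for i in range(5)] + \
        [f'dog_or_not/cat/{i}.jpg' for i in range(5)]

train, valid = random_split(files)
print(len(train), len(valid))        # 8 2
print(label_from_parent(files[0]))   # dog
```

Fixing the seed means the same files land in the validation set on every run, which keeps error rates comparable between experiments.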
- Fine-tuning the pre-trained model: Use a pre-trained ResNet50 model and fine-tune it on our dataset.
learn = vision_learner(dls, resnet50, metrics=error_rate)
learn.fine_tune(3)
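The `error_rate` metric reported after each epoch is the fraction of validation images the model classifies incorrectly, that is, one minus the accuracy. As a rough illustration of the arithmetic, here is a plain-Python sketch; `simple_error_rate` is a hypothetical name, and the label lists are made-up values rather than results from this model.

```python
def simple_error_rate(predicted, actual):
    # Fraction of items whose predicted label differs from the true label.
    if len(predicted) != len(actual):
        raise ValueError('predicted and actual must have the same length')
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Made-up labels for illustration: one of four predictions is wrong.
predicted = ['dog', 'cat', 'dog', 'dog']
actual    = ['dog', 'cat', 'cat', 'dog']
print(simple_error_rate(predicted, actual))  # 0.25
```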
Step 3: Model use
- Prediction: Classify the previously downloaded sample dog image with the trained model.
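# predict returns the predicted label, its index, and the class probabilities
label,_,probs = learn.predict(PILImage.create('dog.jpg'))
print(f'This is a: {label}.')
# the classes are ordered alphabetically (cat, dog), so index 1 is "dog"
print(f"Probability it's a dog: {probs[1]:.4f}")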
Output result:
This is a: dog.
Probability it's a dog: 1.0000
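The prediction code reads the dog probability from index 1. That works because the category labels are, by default, sorted alphabetically into a vocabulary, so "cat" is at index 0 and "dog" at index 1. The sketch below shows a way to look up the index by name instead of hard-coding it. The probability values are invented for illustration; in fastai, the corresponding list of labels should be available as `learn.dls.vocab`.

```python
# Class labels as found in the folder names.
labels = ['dog', 'cat']

# Sorting gives the order in which the probabilities are reported.
vocab = sorted(set(labels))          # ['cat', 'dog']

# Made-up probabilities in vocab order, for illustration only.
probs = [0.0003, 0.9997]

# Look up the position of a class by name rather than hard-coding it.
dog_index = vocab.index('dog')       # 1
print(dict(zip(vocab, probs)))
print(f"Probability it's a dog: {probs[dog_index]:.4f}")
```

Looking up the index by name keeps the code correct if more classes are added later and the alphabetical positions shift.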
This tutorial shows how to use FastAI to quickly build a simple image classification model. Remember, the accuracy of your model depends on the quality and quantity of your training data.