The simplest implementation of a database
Among all application software, the database may be the most complex.
MySQL’s manual has more than 3,000 pages, PostgreSQL’s manual has more than 2,000 pages, and Oracle’s manual is thicker than both of them combined.
However, it is not difficult to write the simplest database by yourself. There is a post on Reddit that explains the principle clearly in just a few hundred words. Below is what I compiled based on this post.
1. Save data in text form
The first step is to write the data you want to save into a text file. This text file is your database.
To make reading easy, the data is divided into records of equal length. For example, if each record is 800 bytes long, the fifth record starts at byte offset 3200 (4 × 800), because the first record starts at offset 0.
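The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the space padding, and the comma-separated record contents are invented for the sketch, and an in-memory buffer stands in for the text file.

```python
import io

RECORD_SIZE = 800  # bytes per record, as in the example above


def record_offset(n):
    """Byte offset of the n-th record, counting records from 1."""
    return (n - 1) * RECORD_SIZE


def write_record(f, n, text):
    """Store text as the n-th record, padded with spaces to RECORD_SIZE."""
    data = text.encode("utf-8")
    if len(data) > RECORD_SIZE:
        raise ValueError("record too long")
    f.seek(record_offset(n))
    f.write(data.ljust(RECORD_SIZE, b" "))


def read_record(f, n):
    """Jump straight to the n-th record and read it, without scanning."""
    f.seek(record_offset(n))
    return f.read(RECORD_SIZE).decode("utf-8").rstrip(" ")


# Usage: an in-memory buffer stands in for the database file.
db = io.BytesIO()
write_record(db, 1, "1,Alice")
write_record(db, 5, "5,Eve")
print(record_offset(5))    # 3200
print(read_record(db, 5))  # 5,Eve
```

Because every record has the same length, reading record n needs one seek and one read, no matter how large the file is.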
Most of the time we do not know the position of a record; we only know the value of its primary key. To read the data, you could compare the records one by one, but this is too inefficient. In practice, databases often store data in a B-tree format.
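To see why comparing records one by one is inefficient, here is a minimal sketch of such a scan. The 32-byte record length, the `key,value` layout, and the sample rows are assumptions made only for this example.

```python
import io

RECORD_SIZE = 32  # small fixed record length, chosen only for this sketch


def make_db(rows):
    """Pack (key, value) rows into fixed-length records in memory."""
    buf = io.BytesIO()
    for key, value in rows:
        buf.write(f"{key},{value}".encode("utf-8").ljust(RECORD_SIZE, b" "))
    return buf


def find_by_key(db, key):
    """Compare records one by one; return (value or None, records_read)."""
    db.seek(0)
    reads = 0
    while True:
        raw = db.read(RECORD_SIZE)
        if len(raw) < RECORD_SIZE:  # end of file: key not present
            return None, reads
        reads += 1
        record_key, _, value = raw.decode("utf-8").rstrip(" ").partition(",")
        if record_key == str(key):
            return value, reads


db = make_db([(101, "Alice"), (205, "Bob"), (150, "Carol"), (320, "Dave")])
print(find_by_key(db, 150))  # ('Carol', 3)
print(find_by_key(db, 999))  # (None, 4)
```

A missing key forces the scan to read every record, so the cost grows in direct proportion to the size of the file.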
2. What is a B-tree?
To understand the B-tree, we must start from the binary search tree.
A binary search tree is a data structure with very high search efficiency. It has three characteristics.
(1) Each node has at most two subtrees.
(2) Every value in the left subtree is less than the parent node's value, and every value in the right subtree is greater.
(3) When the tree is balanced, finding the target value among n nodes generally takes only about log2(n) comparisons.
The binary search tree is not well suited to databases, because its search efficiency depends on the number of levels. The deeper a value sits, the more comparisons are needed. In the extreme case, where the tree degenerates into a chain, finding the target value among n values takes n comparisons. For a database, each level descended means another read from the hard disk. This is very costly, because reading from the hard disk takes far longer than processing the data. The fewer times the database reads the hard disk, the better.
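The effect of tree depth can be checked with a small sketch. The class and function names here are illustrative, not taken from the original post; the search counts how many nodes it visits, which corresponds to the number of levels entered.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value iteratively and return the root of the tree."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        elif value > node.value:
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right
        else:
            return root  # duplicate value: nothing to do


def search(root, target):
    """Return (found, nodes_visited)."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if target == node.value:
            return True, visited
        node = node.left if target < node.value else node.right
    return False, visited


def build(values):
    root = None
    for v in values:
        root = insert(root, v)
    return root


balanced = build([4, 2, 6, 1, 3, 5, 7])    # three levels
degenerate = build([1, 2, 3, 4, 5, 6, 7])  # sorted input: a chain of 7 levels
print(search(balanced, 7))    # (True, 3)
print(search(degenerate, 7))  # (True, 7)
```

The same seven values need 3 visits in the balanced tree and 7 in the degenerate one. If each visit were a hard disk read, the difference would dominate the search time.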
The B-tree is an improvement on the binary search tree. Its design idea is to keep related data together as much as possible, so that multiple values can be read at once and the number of hard disk operations is reduced.
The B-tree also has three characteristics.
(1) A node can hold multiple values. For example, in the figure above, the largest node holds 4 values.
(2) A new layer is added only when the existing nodes are full. In other words, the B-tree keeps the number of "layers" as small as possible.
(3) The values in a child node have a strict ordering relationship with the values in the parent node. Generally speaking, if the parent node has a values, it has a+1 child nodes. For example, in the picture above, the parent node has two values (7 and 16), which correspond to three child nodes. The first child node holds values less than 7, the last child node holds values greater than 16, and the middle child node holds values between 7 and 16.
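A search over such a tree might look like the following sketch. Only the parent values 7 and 16 come from the example above; the values in the child nodes, the class name, and the function name are made up for illustration, and insertion (including node splitting) is left out.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted values held in this node
        self.children = children or []  # empty for a leaf, else len(keys) + 1


def btree_search(node, target):
    """Return (found, nodes_read)."""
    nodes_read = 0
    while True:
        nodes_read += 1
        # Position of the first key that is >= target within this node.
        i = bisect_left(node.keys, target)
        if i < len(node.keys) and node.keys[i] == target:
            return True, nodes_read
        if not node.children:
            return False, nodes_read
        # Child i holds the values that fall between keys[i-1] and keys[i].
        node = node.children[i]


root = BTreeNode(
    [7, 16],
    [BTreeNode([1, 2, 5, 6]), BTreeNode([9, 12]), BTreeNode([18, 21])],
)
print(btree_search(root, 12))  # (True, 2)
print(btree_search(root, 16))  # (True, 1)
print(btree_search(root, 8))   # (False, 2)
```

Each loop iteration stands for reading one node, so `nodes_read` is the number of hard disk reads the search would cost if no node were cached.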
This data structure is very helpful in reducing the number of reads from the hard disk. Assuming that a node can hold 100 values, a 3-layer B-tree can hold about 1 million values. A binary search tree would need 20 layers to hold the same amount! Assuming that the operating system reads one node at a time and the root node stays in memory, the B-tree needs only two hard disk reads to find the target value among 1 million values.
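The figures above can be verified with a short calculation. The sketch assumes every node is completely full and that a node with 100 values has 101 children; the function names are illustrative.

```python
def btree_capacity(values_per_node, levels):
    """Maximum number of values in a B-tree in which every node is full."""
    children_per_node = values_per_node + 1
    nodes = sum(children_per_node ** level for level in range(levels))
    return nodes * values_per_node


def bst_levels(n):
    """Minimum number of levels a binary search tree needs to hold n values.

    A perfectly balanced tree with L levels holds 2**L - 1 values, so the
    smallest sufficient L equals the bit length of n.
    """
    return n.bit_length()


print(btree_capacity(100, 3))  # 1030300
print(bst_levels(1_000_000))   # 20
```

The exact capacity is slightly above 1 million because each full node has 101 children rather than 100.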
3. Index
Storing the database in B-tree format only solves the problem of searching by "primary key". If you want to search by other fields, you need to create an index.
An index is a B-tree file keyed on a particular field. Suppose there is an "employee table" containing two fields: employee number (primary key) and name. An index file can be created for names. This file stores the names in B-tree format, and each name is followed by its position in the database (that is, which record it is). To search for a name, first find the corresponding record position in the index, and then read the record from the table.
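The two-step lookup can be sketched as follows. To keep the example short, a sorted list searched with `bisect` stands in for the B-tree file; that substitution, the sample employees, and the function name are all assumptions of this sketch.

```python
from bisect import bisect_left

# The "table": each record's position in the list is its record number.
employees = [
    (1001, "Dana"),
    (1002, "Alex"),
    (1003, "Chris"),
    (1004, "Blake"),
]

# The "index file": names in sorted order, each followed by its record position.
name_index = sorted((name, pos) for pos, (_, name) in enumerate(employees))


def find_by_name(name):
    """Look the name up in the index, then read the record from the table."""
    # (name,) sorts just before (name, pos), so this finds the first match.
    i = bisect_left(name_index, (name,))
    if i < len(name_index) and name_index[i][0] == name:
        position = name_index[i][1]
        return employees[position]
    return None


print(find_by_name("Chris"))  # (1003, 'Chris')
print(find_by_name("Zoe"))    # None
```

The table itself stays in primary-key order; only the index is sorted by name, which is why one table can have several indexes on different fields.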
This index search method is called the "Indexed Sequential Access Method", abbreviated as ISAM. It already has multiple implementations (such as the C-ISAM and D-ISAM libraries). With one of these libraries, you can write the simplest database by yourself.
4. Advanced functions
Once the most basic data access (including indexing) is in place, some advanced functions can also be implemented.
(1) SQL is the universal language for operating databases, so a SQL parser is needed to translate SQL commands into the corresponding ISAM operations.
(2) A join connects two tables in the database through "foreign keys". This operation needs to be optimized.
(3) A transaction is a series of database operations executed as a batch. If any step fails, the whole batch fails. Therefore an "operation log" is needed so that the operations can be rolled back on failure.
(4) Backup mechanism: Save a copy of the database.
(5) Remote operation: Allow users to operate the database from different machines over TCP/IP.
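As an illustration of item (3), the sketch below keeps an operation log of old values and replays it in reverse to roll back a failed transaction. It is a toy in-memory version under stated assumptions: the `TinyStore` class, its methods, and the `transfer` example are hypothetical, and a real database would write the log to disk before changing the data.

```python
class TinyStore:
    """In-memory key-value table with an undo log (hypothetical names)."""

    _MISSING = object()  # marks a key that did not exist before the write

    def __init__(self):
        self.data = {}
        self.log = None  # None means no transaction is open

    def begin(self):
        self.log = []

    def put(self, key, value):
        if self.log is not None:
            # Record the old value first so the write can be undone.
            self.log.append((key, self.data.get(key, self._MISSING)))
        self.data[key] = value

    def commit(self):
        self.log = None

    def rollback(self):
        # Undo the logged writes in reverse order.
        for key, old in reversed(self.log):
            if old is self._MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self.log = None


def transfer(store, src, dst, amount):
    """Move amount from src to dst; roll back if any step fails."""
    store.begin()
    try:
        store.put(src, store.data[src] - amount)
        if store.data[src] < 0:
            raise ValueError("insufficient balance")
        store.put(dst, store.data[dst] + amount)
        store.commit()
        return True
    except (KeyError, ValueError):
        store.rollback()
        return False


store = TinyStore()
store.put("alice", 100)
store.put("bob", 50)
print(transfer(store, "alice", "bob", 30), store.data)   # True {'alice': 70, 'bob': 80}
print(transfer(store, "alice", "bob", 500), store.data)  # False {'alice': 70, 'bob': 80}
```

The second transfer fails after the first write has already been applied, and the log restores the old balance, so the table is left exactly as it was before the transaction began.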
