
Matching Chinese text with Python regular expressions

Jun 10, 2016, 03:04 PM
python regular expression

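Regular expressions are not part of Python itself. They are a powerful tool for processing strings, with their own syntax and a separate matching engine. They may be slower than the built-in str methods, but they are far more capable. Because of this independence, the core syntax is largely the same in every language that offers regular expressions; implementations differ mainly in how many syntax features they support. This is rarely a problem, because the unsupported features are usually the less common ones.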

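Introduction to Python regular expressions

A regular expression is a special sequence of characters that lets you conveniently check whether a string matches a given pattern.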
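As a minimal sketch of checking a string against a pattern, the snippet below uses `re.fullmatch` and `re.search` from the standard library. The helper names `is_chinese_only` and `contains_chinese`, and the choice of the range U+4E00 to U+9FFF, are illustrative assumptions rather than anything defined in this article.

```python
import re

# Hypothetical helper: True when every character of text lies in the
# CJK Unified Ideographs block (U+4E00 to U+9FFF).
def is_chinese_only(text):
    return re.fullmatch(r"[\u4e00-\u9fff]+", text) is not None

# Hypothetical helper: True when at least one such character occurs anywhere.
def contains_chinese(text):
    return re.search(r"[\u4e00-\u9fff]", text) is not None

print(is_chinese_only("中文"))     # True
print(is_chinese_only("a中文x"))   # False
print(contains_chinese("a中文x"))  # True
```

`re.fullmatch` requires the whole string to match, while `re.search` only needs one occurrence, which is usually the distinction that matters when validating input versus scanning text.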

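Python has included the re module since version 1.5. It provides Perl-style regular expression patterns.

The re module gives Python full regular expression support.

The compile function builds a regular expression object from a pattern string and optional flags. That object has a set of methods for regular expression matching and substitution.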
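A small sketch of that workflow, assuming an example pattern and sample text chosen only for illustration: the pattern is compiled once with the standard `re.IGNORECASE` flag, and the resulting object is reused for searching, listing matches, and substitution.

```python
import re

# Compile once with an optional flag, then reuse the pattern object.
# The pattern and the sample text are illustrative only.
word = re.compile(r"python", re.IGNORECASE)

text = "Python and PYTHON and python"
first = word.search(text)        # first match object, or None if nothing matches
all_hits = word.findall(text)    # list of every matched substring
replaced = word.sub("re", text)  # new string with every match replaced

print(first.group())  # Python
print(all_hits)       # ['Python', 'PYTHON', 'python']
print(replaced)       # re and re and re
```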

The re module also provides module-level functions that do exactly what these methods do; they take a pattern string as their first argument.
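To illustrate the equivalence, the sketch below runs the same search once through the module-level function and once through a compiled object. The variable names are assumptions made for this example.

```python
import re

pattern = r"(\d+)-(\d+)"
text = "Phone No. 010-87654321"

# Module-level function: the pattern string is the first argument.
m1 = re.search(pattern, text)

# The same search through a compiled pattern object.
m2 = re.compile(pattern).search(text)

print(m1.groups())                 # ('010', '87654321')
print(m1.groups() == m2.groups())  # True
```

Compiling explicitly is mostly useful when the same pattern is applied many times, since the pattern object can be stored and reused.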

All of the above is groundwork for the main topic. Now let's look at how Python regular expressions match Chinese text.

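# -*- coding: utf-8 -*-
import re

def findPart(regex, text, name):
    # Collect every non-overlapping match of regex in text.
    res = re.findall(regex, text)
    if res:
        print("There are %d %s parts:\n" % (len(res), name))
        for r in res:
            print("\t" + r)
        print()

# Python 3 str is already Unicode, so no unicode() conversion is needed.
text = "#who#helloworld#a中文x#"
# \w covers word characters; \u2E80-\u9FFF adds the CJK range explicitly.
findPart(r"#[\w\u2E80-\u9FFF]+#", text, "unicode chinese")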
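One detail worth checking when running the pattern above under Python 3: for `str` patterns, `\w` is Unicode-aware by default, so the explicit range only becomes necessary when `\w` is restricted with the standard `re.ASCII` flag. The sketch below compares the three cases; the variable names are assumptions made for this example.

```python
import re

text = "#who#helloworld#a中文x#"

# Default in Python 3: \w on a str pattern also matches Chinese characters.
unicode_hits = re.findall(r"#\w+#", text)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the Chinese part is missed.
ascii_hits = re.findall(r"#\w+#", text, re.ASCII)

# Adding the explicit range restores the match even under re.ASCII.
ranged_hits = re.findall(r"#[\w\u2E80-\u9FFF]+#", text, re.ASCII)

print(unicode_hits)  # ['#who#', '#a中文x#']
print(ascii_hits)    # ['#who#']
print(ranged_hits)   # ['#who#', '#a中文x#']
```

Note that `#helloworld#` never appears in the results: `findall` returns non-overlapping matches, and the `#` that would open it has already been consumed by `#who#`.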

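Note:

Main character ranges for several non-English scripts

2E80~33FFh: CJK symbols area. Contains Kangxi dictionary radicals, CJK supplementary radicals, Bopomofo (zhuyin) symbols, Japanese kana, Korean jamo, CJK symbols and punctuation, enclosed or parenthesized characters, numerals, and months, as well as Japanese kana combinations, units, era names, months, days, and times.

3400~4DFFh: CJK Unified Ideographs Extension A, containing 6,582 CJK ideographs in total.

4E00~9FFFh: CJK Unified Ideographs, containing 20,902 CJK ideographs in total.

A000~A4FFh: Yi script area, containing the syllables and radicals of the Yi script of southern China.

AC00~D7FFh: Hangul syllables area, containing characters composed from Korean jamo.

F900~FAFFh: CJK Compatibility Ideographs, containing 302 CJK ideographs in total.

FB00~FFFDh: presentation forms area, containing Latin ligatures, Hebrew, Arabic, CJK vertical punctuation, small form variants, and halfwidth and fullwidth forms.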
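The ranges listed above can be turned directly into a regular expression character class. The sketch below assumes a hypothetical table `CJK_RANGES` holding three of the ideograph ranges from the list; it is one possible way to assemble the class, not the article's own method.

```python
import re

# Hypothetical subset of the ranges listed above, as inclusive (start, end) pairs.
CJK_RANGES = [
    (0x3400, 0x4DFF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
]

# Build a class such as [\u3400-\u4dff\u4e00-\u9fff\uf900-\ufaff].
char_class = "[" + "".join(
    "\\u%04x-\\u%04x" % (lo, hi) for lo, hi in CJK_RANGES
) + "]"

# One or more consecutive ideographs.
han_run = re.compile(char_class + "+")

print(han_run.findall("#who#helloworld#a中文x#"))  # ['中文']
print(han_run.findall("天人合一, hello"))           # ['天人合一']
```

Keeping the ranges in a table makes it easy to add or drop a block, for example to include the Hangul syllables range when Korean text should match as well.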

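#!/usr/bin/python3
# -*- coding: UTF-8 -*-
import re

# Encode both the text and the pattern to UTF-8, so the search runs on bytes.
message = '天人合一'.encode('utf8')
# group() returns the matched bytes; decode them to print the character.
print(re.search('人'.encode('utf8'), message).group().decode('utf8'))

An example in interactive mode:

>>> import re
>>> s = 'Phone No. 010-87654321'
>>>
>>> r = re.compile(r'(\d+)-(\d+)')
>>> m = r.search(s)
>>> m
<_sre.SRE_Match object at 0x010EE218>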
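In Python 3, `str` is Unicode text, so the same search can also run without encoding anything to bytes. A minimal sketch, reusing the sample string from the script above (the variable names are assumptions):

```python
import re

message = "天人合一"

# Match on str directly; the result is a str and offsets count characters.
m = re.search("人", message)
print(m.group())  # 人
print(m.span())   # (1, 2)

# The bytes variant reports offsets in UTF-8 bytes instead of characters.
mb = re.search("人".encode("utf8"), message.encode("utf8"))
print(mb.span())  # (3, 6)
```

The difference in spans is the practical reason to prefer `str` patterns: character ranges such as `[\u4e00-\u9fff]` only make sense when the engine sees code points rather than individual bytes.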

That covers how to match Chinese text with Python regular expressions. I hope it helps!
