关于kafka可能会丢失消息的问题?
各位大神,小弟最近研究kafka,看了很多说kafka可能会丢失消息。
实在不太明白做日志系统,在什么场景下可以容忍消息的丢失。
比如做实时日志分析系统的话,那么就是说我所看到的日志信息中可能是不全的,如果出现异常日志看不到可能会影响问题的定位?
还有看到说分布式集群kafka的某一节点崩溃可能也会导致这一节点消息的丢失(这个看kafka与rabbitMQ做对比的时候说到的,rabbitMQ不会有这个问题)。
如果说kafka这么不靠谱,为啥这么多大公司都在使用呢?
回复内容:
各位大神,小弟最近研究kafka,看了很多说kafka可能会丢失消息。
实在不太明白做日志系统,在什么场景下可以容忍消息的丢失。
比如做实时日志分析系统的话,那么就是说我所看到的日志信息中可能是不全的,如果出现异常日志看不到可能会影响问题的定位?
还有看到说分布式集群kafka的某一节点崩溃可能也会导致这一节点消息的丢失(这个看kafka与rabbitMQ做对比的时候说到的,rabbitMQ不会有这个问题)。
如果说kafka这么不靠谱,为啥这么多大公司都在使用呢?
rabbitMQ怎么做的,我不知道。
kafka会丢消息主要集中在两个环节
消息落盘时机
消息落盘有异步刷新和同步刷新两种,明显异步刷新的可靠性要高很多。但在某些场景下追求性能而忽略可靠性,可以启用。
消息存储维护
持久化存储,这句话不是说来玩的。Oracle/MySQL做了这么久的存储,其中的灾难恢复工具等都非常完备并形成体系(出问题你能找到人并能解决问题)kafka的存储谁特么知道~工具又特么的少!
另外就是落盘的存储介质,如果不做raid,那么单盘存在损坏的可能;做了raid,则成本上升。如果做多集copy,则存在网络同步延时所带来的瞬间数据不一致。
小结:kafka你要做到完全不丢数据(在非大灾大难的情况下,比如机房被原子弹轰炸;或者raid被误操作弄错同步时间或者低格等),是完全可以的。代价就是丢失一定的性能。
所以kafka我一般用在业务允许少量数据丢失但整体吞吐量非常大的场景(比如日志采集),数据统计分析(却少几百条数据不会对亿万级的样本空间产生什么影响)。
kafka也可以用在两个可靠存储之间做数据同步,比如MySQL(写)->MySQL(度),因为MySQL(写)保证了数据可被重放,所以kafka出问题时恢复速度和恢复可靠程度是可以得到保证的。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











IIS and PHP are compatible and are implemented through FastCGI. 1.IIS forwards the .php file request to the FastCGI module through the configuration file. 2. The FastCGI module starts the PHP process to process requests to improve performance and stability. 3. In actual applications, you need to pay attention to configuration details, error debugging and performance optimization.

Python and C each have their own advantages, and the choice should be based on project requirements. 1) Python is suitable for rapid development and data processing due to its concise syntax and dynamic typing. 2)C is suitable for high performance and system programming due to its static typing and manual memory management.

Laravel is suitable for projects that teams are familiar with PHP and require rich features, while Python frameworks depend on project requirements. 1.Laravel provides elegant syntax and rich features, suitable for projects that require rapid development and flexibility. 2. Django is suitable for complex applications because of its "battery inclusion" concept. 3.Flask is suitable for fast prototypes and small projects, providing great flexibility.

Both Python and JavaScript's choices in development environments are important. 1) Python's development environment includes PyCharm, JupyterNotebook and Anaconda, which are suitable for data science and rapid prototyping. 2) The development environment of JavaScript includes Node.js, VSCode and Webpack, which are suitable for front-end and back-end development. Choosing the right tools according to project needs can improve development efficiency and project success rate.

Golangisidealforbuildingscalablesystemsduetoitsefficiencyandconcurrency,whilePythonexcelsinquickscriptinganddataanalysisduetoitssimplicityandvastecosystem.Golang'sdesignencouragesclean,readablecodeanditsgoroutinesenableefficientconcurrentoperations,t

Choosing Python or C depends on project requirements: 1) If you need rapid development, data processing and prototype design, choose Python; 2) If you need high performance, low latency and close hardware control, choose C.

Python is suitable for beginners and data science, and C is suitable for system programming and game development. 1. Python is simple and easy to use, suitable for data science and web development. 2.C provides high performance and control, suitable for game development and system programming. The choice should be based on project needs and personal interests.

Python is more suitable for data science and automation, while JavaScript is more suitable for front-end and full-stack development. 1. Python performs well in data science and machine learning, using libraries such as NumPy and Pandas for data processing and modeling. 2. Python is concise and efficient in automation and scripting. 3. JavaScript is indispensable in front-end development and is used to build dynamic web pages and single-page applications. 4. JavaScript plays a role in back-end development through Node.js and supports full-stack development.
