
How to read and format a text stream received through a bash pipe?

Feb 10, 2024 pm 11:30 PM
overflow standard library


In daily work we often process text data with command-line tools. On Linux, a bash pipe is a powerful tool that feeds the output of one command to the input of another. But when a large text stream arrives through a pipe, how do we read and format it efficiently? This article walks through a question and answer about handling a text stream received through a bash pipe in Go.

Question content

Currently, I'm using the following to format data in an npm script.

npm run startwin | while IFS= read -r line; do printf '%b\n' "$line"; done | less

It works, but my coworker doesn't use Linux. So I want to implement while IFS= read -r line; do printf '%b\n' "$line"; done in Go and use the binary in the pipeline.

npm run startwin | magical-go-formater

What I tried

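package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "strings"
)

func main() {
    fi, _ := os.Stdin.Stat() // get the FileInfo struct

    // Only read when stdin is not a terminal (for example, a pipe).
    if (fi.Mode() & os.ModeCharDevice) == 0 {
        // ReadAll blocks until stdin reaches EOF.
        bytes, _ := ioutil.ReadAll(os.Stdin)
        str := string(bytes)
        arr := strings.Fields(str) // splits on any whitespace, not only newlines

        for _, v := range arr {
            fmt.Println(v)
        }
    }
}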


Currently, the program silences all output of the text stream.

Workaround

You want to use bufio.Scanner for tail-style reading, where input is handled as it arrives. IMHO the check you did on os.Stdin is unnecessary, but YMMV.

See this answer for an example. ioutil.ReadAll() (now deprecated; just use io.ReadAll()) reads until an error or EOF, but it does not loop over the input as it arrives. That is the reason you need bufio.Scanner.Scan().
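
As a rough illustration of the Scanner-based loop, here is a minimal sketch, not the linked answer's code. The names formatLines and formatString and the 1 MiB line limit are assumptions made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// formatLines copies r to w one line at a time, writing each line as soon
// as it has been read instead of waiting for EOF.
func formatLines(r io.Reader, w io.Writer) error {
	sc := bufio.NewScanner(r)
	// Assumption: allow lines up to 1 MiB (the default limit is 64 KiB).
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		// Text() returns the line without its trailing "\n" or "\r\n".
		if _, err := fmt.Fprintln(w, sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

// formatString is a hypothetical helper for trying formatLines without a pipe.
func formatString(s string) string {
	var out strings.Builder
	if err := formatLines(strings.NewReader(s), &out); err != nil {
		return ""
	}
	return out.String()
}

func main() {
	if err := formatLines(os.Stdin, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Built as magical-go-formater, it would slot into the pipeline from the question: npm run startwin | magical-go-formater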

Additionally, %b converts any escape sequences in the text; for example, any \n in the passed line is rendered as a newline. Do you need that? Go has no equivalent format specifier, AFAIK.
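
If that expansion is needed, it can be approximated by hand. The sketch below assumes a hypothetical helper, expandEscapes, that covers only a subset of what printf '%b' handles (single-character escapes, \e, and \0 followed by up to three octal digits) and leaves everything else untouched. It is not a complete reimplementation.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// expandEscapes is a hypothetical helper that interprets a subset of the
// backslash escapes understood by printf '%b'. Unknown escapes and a
// trailing backslash are kept verbatim.
func expandEscapes(s string) string {
	simple := map[byte]byte{
		'a': '\a', 'b': '\b', 'e': 0x1b, 'f': '\f', 'n': '\n',
		'r': '\r', 't': '\t', 'v': '\v', '\\': '\\',
	}
	var out strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' || i+1 == len(s) {
			out.WriteByte(s[i])
			continue
		}
		i++ // s[i] is now the character after the backslash
		c := s[i]
		if c == '0' {
			// \0 followed by up to three octal digits, e.g. \033 for ESC.
			// Values above 0377 wrap around when converted to a byte.
			v := 0
			for n := 0; n < 3 && i+1 < len(s) && s[i+1] >= '0' && s[i+1] <= '7'; n++ {
				i++
				v = v*8 + int(s[i]-'0')
			}
			out.WriteByte(byte(v))
		} else if r, ok := simple[c]; ok {
			out.WriteByte(r)
		} else {
			out.WriteByte('\\')
			out.WriteByte(c)
		}
	}
	return out.String()
}

func main() {
	// Read stdin line by line and print each line with escapes expanded.
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		fmt.Println(expandEscapes(sc.Text()))
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

If the npm output contains no literal backslash sequences, this step can be dropped and each line printed as is.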

Edit

So I think your approach based on ReadAll() will/might work... eventually. I guess the behavior you expect is that of bufio.Scanner: the receiving process handles bytes as they are written (this is actually a polling operation; see Scan() in the standard library source code for the dirty details).

But ReadAll() buffers everything it reads and does not return until an error or EOF occurs. I hacked up an instrumented version of ReadAll() (an exact copy of the standard library source code, with just a little extra instrumentation output), and you can see that it reads as bytes are being written. It just doesn't return and yield content until the writing process finishes and closes its end of the pipe (its open file handle), which generates an EOF:

package main

import (
    "fmt"
    "io"
    "os"
    "time"
)

func main() {

    // os.Stdin.SetReadDeadline(time.Now().Add(2 * time.Second))

    b, err := readall(os.Stdin)
    if err != nil {
        fmt.Println("error: ", err.Error())
    }

    str := string(b)
    fmt.Println(str)
}

func readall(r io.Reader) ([]byte, error) {
    b := make([]byte, 0, 512)
    i := 0
    for {
        if len(b) == cap(b) {
            // add more capacity (let append pick how much).
            b = append(b, 0)[:len(b)]
        }
        n, err := r.Read(b[len(b):cap(b)])

        //fmt.Fprintf(os.Stderr, "read %d - received: \n%s\n", i, string(b[len(b):cap(b)]))
        fmt.Fprintf(os.Stderr, "%s read %d - received %d bytes\n", time.Now(), i, n)
        i++

        b = b[:len(b)+n]
        if err != nil {
            if err == io.EOF {
                fmt.Fprintln(os.Stderr, "received eof")
                err = nil
            }
            return b, err
        }
    }
}

I wrote a cheap script to generate input. It simulates some long-running work and writes only periodically, which is how I imagine npm behaves in your case:

#!/bin/sh

for x in 1 2 3 4 5 6 7 8 9 10
do
  cat ./main.go
  sleep 10
done

BTW, I find reading the actual standard library code really helpful... or at least interesting in cases like this.
