Xer0x's Underground

ls-go (A "ls" clone in Golang)


License

Disclaimer & Intro


This post is a set of personal notes. Even though I try to explain what I have set up/built and how, I do not owe anyone any explanation. Do NOT expect anything.


My blog is my garden.


So I did another mini-project recently, and finally decided to write some notes about it. It is a clone/replacement of the famous ls program found on *nix systems. DO NOTE that it is not 100% complete yet, so some features and nice-to-haves are still missing. I do this just for fun, practice and learning. Because, what is computing if not fun?


If you want to learn more about ls, please visit here or here.


ls-go vs ls

| Feature | ls-go | ls |
| --- | --- | --- |
| Basic File Listing | ✅ Done | ✅ Done |
| Colorized Output | ✅ Done | ✅ Done |
| Icons Support | ✅ Done | ❌ Not Done |
| Git Status Integration | ✅ Done | ❌ Not Done |
| Tree View Display | ✅ Done | ❌ Not Done |
| Human-Readable Sizes | ✅ Done | ✅ Done |
| File Type Indicators | ✅ Done | ✅ Done |
| Sorting by Modified/Access Time | ✅ Done | ✅ Done |
| JSON Output | ✅ Done | ❌ Not Done |
| Custom Column Layouts | ✅ Done | ❌ Not Done |
| Multithreaded Directory Scanning | ✅ Done | ❌ Not Done |
| Cross-Platform Consistency | ✅ Done | ❌ Not Done |
| Handles Symlinks Gracefully | ✅ Done | ✅ Done |
| File Permissions Display | ✅ Done | ✅ Done |
| Extended Attributes / ACLs | ✅ Done | ✅ Done |
| Remote Filesystem Metadata Caching | ❌ Not Done | ❌ Not Done |

Legend



Imports


Our custom version of the Unix ls command relies on several standard library packages and one external dependency. The fmt, os, path/filepath, runtime, sort, strconv, strings, syscall, time, and io/fs packages handle formatting, file system operations, path manipulation, sorting, and system calls. The os/user package retrieves user and group information. The external package github.com/alitto/pond provides a worker pool for concurrent file processing, which is meant to improve performance on multi-core systems.


package main

import (
	"fmt"
	"io/fs"
	"os"
	"os/user"
	"path/filepath"
	"runtime"
	"sort"
	"strconv"
	"strings"
	"syscall"
	"time"

	"github.com/alitto/pond"
)

FileInfo Struct


The FileInfo struct is a custom type that holds more than the standard fs.FileInfo interface exposes: name, mode, size, modification time, access time, change time, inode number, block count, link count, user ID, group ID, and device numbers (major/minor). It also tracks whether the file is a directory or symlink and includes the symlink target and file flags. This struct is central to capturing and displaying comprehensive file information.


type FileInfo struct {
	Name       string
	Mode       fs.FileMode
	Size       int64
	ModTime    time.Time
	AccessTime time.Time
	ChangeTime time.Time
	Inode      uint64
	Blocks     int64
	Links      uint64
	Uid        uint32
	Gid        uint32
	Major      uint32
	Minor      uint32
	IsDir      bool
	IsSymlink  bool
	LinkTarget string
	Flags      uint32
}
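
As a rough illustration of where such fields can come from, here is a minimal sketch that fills a trimmed-down struct from os.Lstat and the raw stat result. The fileMeta type and statMeta helper are hypothetical names made up for this example (they are not part of ls-go), and the sketch assumes a Unix-like system where FileInfo.Sys() returns a *syscall.Stat_t.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"syscall"
)

// fileMeta is a hypothetical, trimmed-down stand-in for the FileInfo struct.
type fileMeta struct {
	Name       string
	Mode       fs.FileMode
	Size       int64
	Inode      uint64
	Blocks     int64
	Links      uint64
	Uid, Gid   uint32
	IsDir      bool
	IsSymlink  bool
	LinkTarget string
}

// statMeta collects metadata for path without following symlinks.
func statMeta(path string) (*fileMeta, error) {
	fi, err := os.Lstat(path)
	if err != nil {
		return nil, err
	}
	m := &fileMeta{
		Name:      fi.Name(),
		Mode:      fi.Mode(),
		Size:      fi.Size(),
		IsDir:     fi.IsDir(),
		IsSymlink: fi.Mode()&fs.ModeSymlink != 0,
	}
	// Assumption: Unix-like OS, where Sys() carries the raw stat result.
	// Field widths differ per platform, hence the explicit conversions.
	if st, ok := fi.Sys().(*syscall.Stat_t); ok {
		m.Inode = uint64(st.Ino)
		m.Blocks = int64(st.Blocks)
		m.Links = uint64(st.Nlink)
		m.Uid = st.Uid
		m.Gid = st.Gid
	}
	// Only symlinks have a target worth recording.
	if m.IsSymlink {
		if target, err := os.Readlink(path); err == nil {
			m.LinkTarget = target
		}
	}
	return m, nil
}

func main() {
	m, err := statMeta(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s inode=%d links=%d uid=%d gid=%d blocks=%d\n",
		m.Name, m.Inode, m.Links, m.Uid, m.Gid, m.Blocks)
}
```

Access and change times are left out here because their field names inside syscall.Stat_t differ between operating systems.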

Options Struct


The Options struct defines command-line flags that control the behavior of the ls command. It includes flags for output formatting (e.g., -l for long format, -C for columns), sorting (e.g., -t for time, -S for size), file inclusion (e.g., -a for all files, -A for almost all), and other behaviors like recursion (-R), human-readable sizes (-h), and symlink handling (-L, -H). This struct allows flexible customization of the command's output.

type Options struct {
	One           bool // -1
	All           bool // -a
	AlmostAll     bool // -A
	Classify      bool // -F
	NoSort        bool // -f
	LongFormat    bool // -l
	GroupFormat   bool // -g
	NumericFormat bool // -n
	Columns       bool // -C
	Stream        bool // -m
	Comma         bool // -x
	Directory     bool // -d
	Human         bool // -h
	Inode         bool // -i
	Kilobytes     bool // -k
	Follow        bool // -L
	NoFollow      bool // -H
	Flags         bool // -o
	Slash         bool // -p
	Quote         bool // -q
	Recursive     bool // -R
	Reverse       bool // -r
	SizeSort      bool // -S
	Blocks        bool // -s
	TimeSort      bool // -t
	AccessTime    bool // -u
	ChangeTime    bool // -c
	FullTime      bool // -T
}

Main Function


The main function serves as the entry point, initializing a worker pool using the pond package for concurrent file processing. It parses command-line arguments, checks for the --help flag to display usage information, and processes the provided files or defaults to the current directory (.). The function orchestrates the high-level workflow, delegating tasks to helper functions for argument parsing, file processing, and display.


func main() {
	args := os.Args[1:]

	// Check for --help flag
	for _, arg := range args {
		if arg == "--help" {
			printHelp()
			return
		}
	}

	// Initialize worker pool
	maxWorkers := min(MAX_WORKERS, runtime.NumCPU()*4)
	pool = pond.New(maxWorkers, maxWorkers*2)
	defer pool.StopAndWait()

	files := parseArgs(args)

	if len(files) == 0 {
		files = []string{"."}
	}

	// Process files concurrently
	processFiles(files)
}
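
To make the pool lifecycle a bit more concrete, here is a small standard-library sketch of a bounded worker pool with the same Submit / StopAndWait shape. It is a stand-in written for this note, not pond itself, and the workerPool type, the maxWorkers cap of 64, and the sumSquares demo are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// workerPool is a hypothetical stdlib stand-in for a pool such as pond's.
type workerPool struct {
	tasks chan func()
	wg    sync.WaitGroup
}

// newWorkerPool starts `workers` goroutines that drain a queue
// holding up to `capacity` pending tasks.
func newWorkerPool(workers, capacity int) *workerPool {
	p := &workerPool{tasks: make(chan func(), capacity)}
	p.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer p.wg.Done()
			for task := range p.tasks {
				task()
			}
		}()
	}
	return p
}

// Submit queues a task and blocks while the queue is full.
func (p *workerPool) Submit(task func()) { p.tasks <- task }

// StopAndWait closes the queue and waits for all queued tasks to finish.
func (p *workerPool) StopAndWait() {
	close(p.tasks)
	p.wg.Wait()
}

// sumSquares adds 1*1 + 2*2 + ... + n*n using the pool.
func sumSquares(n int) int {
	const maxWorkers = 64 // assumed cap, playing the role of MAX_WORKERS
	workers := runtime.NumCPU() * 4
	if workers > maxWorkers {
		workers = maxWorkers
	}
	pool := newWorkerPool(workers, workers*2)

	var mu sync.Mutex
	total := 0
	for i := 1; i <= n; i++ {
		i := i // per-iteration copy for Go versions before 1.22
		pool.Submit(func() {
			mu.Lock()
			total += i * i
			mu.Unlock()
		})
	}
	pool.StopAndWait()
	return total
}

func main() {
	fmt.Println(sumSquares(10)) // prints 385
}
```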

parseArgs Function


The parseArgs function processes command-line arguments, identifying files and flags. It handles combined flags (e.g., -la) by iterating through each character and setting the corresponding Options struct fields. It resolves conflicts (e.g., -l overriding -g) and ensures that -f implies -a. The function returns a list of file paths to process, separating them from the flags.

func parseArgs(args []string) []string {
	var files []string

	for i := 0; i < len(args); i++ {
		arg := args[i]
		if !strings.HasPrefix(arg, "-") {
			files = append(files, arg)
			continue
		}

		// Handle combined flags like -la
		flags := arg[1:]
		for _, flag := range flags {
			switch flag {
			case '1':
				opts.One = true
			case 'a':
				opts.All = true
			case 'A':
				opts.AlmostAll = true
			case 'C':
				opts.Columns = true
			case 'c':
				opts.ChangeTime = true
			case 'd':
				opts.Directory = true
			case 'F':
				opts.Classify = true
			case 'f':
				opts.NoSort = true
				opts.All = true // -f implies -a
			case 'g':
				opts.GroupFormat = true
				opts.LongFormat = true
			case 'H':
				opts.NoFollow = true
			case 'h':
				opts.Human = true
			case 'i':
				opts.Inode = true
			case 'k':
				opts.Kilobytes = true
			case 'L':
				opts.Follow = true
			case 'l':
				opts.LongFormat = true
			case 'm':
				opts.Stream = true
			case 'n':
				opts.NumericFormat = true
				opts.LongFormat = true
			case 'o':
				opts.Flags = true
			case 'p':
				opts.Slash = true
			case 'q':
				opts.Quote = true
			case 'R':
				opts.Recursive = true
			case 'r':
				opts.Reverse = true
			case 'S':
				opts.SizeSort = true
			case 's':
				opts.Blocks = true
			case 'T':
				opts.FullTime = true
			case 't':
				opts.TimeSort = true
			case 'u':
				opts.AccessTime = true
			case 'x':
				opts.Comma = true
			}
		}
	}

	// Handle conflicting options: an explicitly given -l overrides -g
	for _, arg := range args {
		if strings.HasPrefix(arg, "-") && strings.ContainsRune(arg[1:], 'l') {
			opts.GroupFormat = false
		}
	}

	if opts.NoSort {
		opts.TimeSort = false
		opts.SizeSort = false
	}

	return files
}
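
The combined-flag handling can be tried in isolation with a reduced flag set. The sketch below is only an illustration: flagSet and parseFlags are hypothetical names, only four flags are covered, and the result is returned instead of being written to a global opts variable so that it is easy to test.

```go
package main

import (
	"fmt"
	"strings"
)

// flagSet is a hypothetical, reduced version of the Options struct.
type flagSet struct {
	Long    bool // -l
	All     bool // -a
	NoSort  bool // -f
	Reverse bool // -r
}

// parseFlags splits args into flags and file operands.
// Every character after the leading "-" is treated as its own flag,
// so "-la" behaves like "-l -a". Unknown flags are ignored.
func parseFlags(args []string) (flagSet, []string) {
	var fl flagSet
	var files []string
	for _, arg := range args {
		if !strings.HasPrefix(arg, "-") {
			files = append(files, arg)
			continue
		}
		for _, c := range arg[1:] {
			switch c {
			case 'l':
				fl.Long = true
			case 'a':
				fl.All = true
			case 'f':
				fl.NoSort = true
				fl.All = true // -f implies -a
			case 'r':
				fl.Reverse = true
			}
		}
	}
	return fl, files
}

func main() {
	fl, files := parseFlags([]string{"-la", "src", "-f", "docs"})
	fmt.Printf("%+v %v\n", fl, files)
}
```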

processFiles and processDirectory Functions


The processFiles function separates directories from non-directories, sorts them based on the specified options, and displays non-directories first. For directories, it calls processDirectory, which reads directory contents using readDirFast and filters entries based on flags like -a or -A. Both functions support recursive processing (-R) and concurrent execution via the worker pool, ensuring efficient handling of large directories.


func processFiles(files []string) {
	var dirs, nonDirs []FileInfo

	// Separate directories from non-directories
	for _, file := range files {
		info, err := getFileInfo(file)
		if err != nil {
			fmt.Fprintf(os.Stderr, "ls: %s: %v\n", file, err)
			continue
		}

		if info.IsDir && !opts.Directory {
			dirs = append(dirs, *info)
		} else {
			nonDirs = append(nonDirs, *info)
		}
	}

	// Sort and display non-directories first
	if len(nonDirs) > 0 {
		sortFiles(nonDirs)
		displayFiles(nonDirs, "")
	}

	// Process directories
	sortFiles(dirs)
	for i, dir := range dirs {
		if len(files) > 1 || opts.Recursive {
			if i > 0 || len(nonDirs) > 0 {
				fmt.Println()
			}
			fmt.Printf("%s:\n", dir.Name)
		}
		processDirectory(dir.Name)

		if opts.Recursive {
			processRecursive(dir.Name)
		}
	}
}

func processDirectory(dirPath string) {
	entries, err := readDirFast(dirPath)
	if err != nil {
		fmt.Fprintf(os.Stderr, "ls: %s: %v\n", dirPath, err)
		return
	}

	// Filter entries
	var filtered []FileInfo
	for _, entry := range entries {
		if shouldSkipEntry(entry.Name) {
			continue
		}
		filtered = append(filtered, entry)
	}

	sortFiles(filtered)
	displayFiles(filtered, dirPath)
}
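
The shouldSkipEntry and sortFiles helpers are not shown in this post. As a rough idea of what such helpers might look like, here is a sketch under my own assumptions: the entry and listOpts types and the shouldSkip and sortEntries functions are hypothetical, and the ordering rules (largest first for -S, newest first for -t, name as tie-breaker) follow the usual ls conventions rather than the actual ls-go code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// entry and listOpts are hypothetical stand-ins for FileInfo and Options.
type entry struct {
	Name    string
	Size    int64
	ModTime time.Time
}

type listOpts struct {
	All, AlmostAll, TimeSort, SizeSort, Reverse bool
}

// shouldSkip reports whether name is hidden under the given options.
func shouldSkip(name string, o listOpts) bool {
	if o.All {
		return false // -a shows everything
	}
	if name == "." || name == ".." {
		return true // -A still hides these two
	}
	if strings.HasPrefix(name, ".") {
		return !o.AlmostAll
	}
	return false
}

// sortEntries orders entries in place: by size (-S), then time (-t),
// falling back to name; -r flips the final order.
func sortEntries(es []entry, o listOpts) {
	less := func(a, b entry) bool {
		switch {
		case o.SizeSort && a.Size != b.Size:
			return a.Size > b.Size // largest first
		case o.TimeSort && !a.ModTime.Equal(b.ModTime):
			return a.ModTime.After(b.ModTime) // newest first
		default:
			return a.Name < b.Name
		}
	}
	sort.SliceStable(es, func(i, j int) bool {
		if o.Reverse {
			return less(es[j], es[i])
		}
		return less(es[i], es[j])
	})
}

func main() {
	o := listOpts{SizeSort: true}
	all := []entry{
		{Name: ".git", Size: 96},
		{Name: "main.go", Size: 2048},
		{Name: "README.md", Size: 512},
	}
	var visible []entry
	for _, e := range all {
		if !shouldSkip(e.Name, o) {
			visible = append(visible, e)
		}
	}
	sortEntries(visible, o)
	for _, e := range visible {
		fmt.Println(e.Name)
	}
}
```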

readDirFast Function


The readDirFast function reads directory entries in batches of 1000 using (*os.File).Readdir and converts each batch concurrently on the worker pool. Each entry becomes a FileInfo struct via convertFileInfo, which fetches additional metadata via syscall.Lstat for compatibility. Results are collected in completion order, so the caller sorts them afterwards. Spreading the per-entry metadata lookups across workers is meant to keep large directories fast.


func readDirFast(dirPath string) ([]FileInfo, error) {
	file, err := os.Open(dirPath)
	if err != nil {
		return nil, err
	}
	defer file.Close()

	// Read directory entries in batches
	const batchSize = 1000
	var allEntries []FileInfo

	for {
		entries, err := file.Readdir(batchSize)
		if err != nil {
			if len(entries) == 0 {
				break
			}
		}

		if len(entries) == 0 {
			break
		}

		// Process entries concurrently
		infoChan := make(chan FileInfo, len(entries))

		for _, entry := range entries {
			pool.Submit(func(entry fs.FileInfo) func() {
				return func() {
					fullPath := filepath.Join(dirPath, entry.Name())
					info := convertFileInfo(entry, fullPath)
					infoChan <- *info
				}
			}(entry))
		}

		// Collect results
		for i := 0; i < len(entries); i++ {
			allEntries = append(allEntries, <-infoChan)
		}

		if err != nil {
			break
		}
	}

	return allEntries, nil
}
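
For comparison, here is a minimal single-threaded sketch of the batching loop on its own, with the end-of-directory case checked explicitly through io.EOF so that any other read error is returned to the caller. It uses (*os.File).ReadDir instead of Readdir, which is a swap on my side, and readDirBatched is a hypothetical name for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
)

// readDirBatched returns the entry names of dirPath, reading at most
// batchSize entries per call to ReadDir.
func readDirBatched(dirPath string, batchSize int) ([]string, error) {
	if batchSize <= 0 {
		batchSize = 1000 // with n <= 0 ReadDir never reports io.EOF
	}
	dir, err := os.Open(dirPath)
	if err != nil {
		return nil, err
	}
	defer dir.Close()

	var names []string
	for {
		entries, err := dir.ReadDir(batchSize)
		// Keep whatever was read, even if an error came with it.
		for _, e := range entries {
			names = append(names, e.Name())
		}
		if errors.Is(err, io.EOF) {
			return names, nil // normal end of directory
		}
		if err != nil {
			return names, err // real read error, partial result
		}
	}
}

func main() {
	names, err := readDirBatched(".", 2)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(len(names), "entries")
}
```

ReadDir hands back fs.DirEntry values, so the full metadata lookup can be deferred to the entries that actually get displayed.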

displayLongFormat and formatLongLine Functions


The displayLongFormat function generates detailed output for files, including inode numbers, block counts, permissions, links, owner, group, size, and timestamps. It calculates total blocks and formats each file using formatLongLine, which constructs a line with fields like mode, size (human-readable with -h), and time (full format with -T). The function supports flags like -i, -s, -n, and -o for additional metadata.

func displayLongFormat(files []FileInfo) {
	// Calculate total blocks
	var totalBlocks int64
	for _, file := range files {
		totalBlocks += file.Blocks
	}

	if opts.Kilobytes {
		totalBlocks = (totalBlocks * BLOCKSIZE) / 1024
	}

	if len(files) > 0 {
		fmt.Printf("total %d\n", totalBlocks)
	}

	for _, file := range files {
		line := formatLongLine(file)
		fmt.Println(line)
	}
}

func formatLongLine(file FileInfo) string {
	var parts []string

	// Inode
	if opts.Inode {
		parts = append(parts, fmt.Sprintf("%8d", file.Inode))
	}

	// Blocks
	if opts.Blocks {
		blocks := file.Blocks
		if opts.Kilobytes && blocks > 0 {
			blocks = (blocks * BLOCKSIZE) / 1024
		}
		parts = append(parts, fmt.Sprintf("%6d", blocks))
	}

	// Mode
	parts = append(parts, formatMode(file.Mode, file.IsSymlink))

	// Links
	parts = append(parts, fmt.Sprintf("%3d", file.Links))

	// Owner
	if !opts.GroupFormat {
		if opts.NumericFormat {
			parts = append(parts, fmt.Sprintf("%-8d", file.Uid))
		} else {
			parts = append(parts, fmt.Sprintf("%-8s", getUserName(file.Uid)))
		}
	}

	// Group
	if opts.NumericFormat {
		parts = append(parts, fmt.Sprintf("%-8d", file.Gid))
	} else {
		parts = append(parts, fmt.Sprintf("%-8s", getGroupName(file.Gid)))
	}

	// Flags
	if opts.Flags {
		parts = append(parts, formatFlags(file.Flags))
	}

	// Size or device numbers
	if file.Major != 0 || file.Minor != 0 {
		parts = append(parts, fmt.Sprintf("%3d, %3d", file.Major, file.Minor))
	} else {
		sizeStr := formatSize(file.Size)
		parts = append(parts, fmt.Sprintf("%8s", sizeStr))
	}

	// Time
	timeStr := formatTime(file.ModTime, file.AccessTime, file.ChangeTime)
	parts = append(parts, timeStr)

	// Name
	name := file.Name
	if opts.Quote {
		name = quoteFileName(name)
	}

	if opts.Classify {
		name += getClassifyChar(file)
	} else if opts.Slash && file.IsDir {
		name += "/"
	}

	if file.IsSymlink && file.LinkTarget != "" {
		name += " -> " + file.LinkTarget
	}

	parts = append(parts, name)

	return strings.Join(parts, " ")
}
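
The formatTime helper is not shown here. A common ls convention is to print the clock time for recent files and the year for older (or future-dated) ones, so a sketch along those lines could look like the following. The formatListingTime and utc names, the 182-day approximation of six months, and the exact layouts are assumptions for this example, not the ls-go implementation.

```go
package main

import (
	"fmt"
	"time"
)

// utc is a small convenience constructor for the demo below.
func utc(year int, month time.Month, day, hour, minute int) time.Time {
	return time.Date(year, month, day, hour, minute, 0, 0, time.UTC)
}

// formatListingTime renders t the way a long listing usually does:
//   - fullTime (-T): month, day, time with seconds, and year
//   - older than about six months, or in the future: month, day, year
//   - otherwise: month, day, hour:minute
func formatListingTime(t, now time.Time, fullTime bool) string {
	const sixMonths = 182 * 24 * time.Hour // rough approximation
	switch {
	case fullTime:
		return t.Format("Jan _2 15:04:05 2006")
	case t.After(now) || now.Sub(t) > sixMonths:
		return t.Format("Jan _2  2006")
	default:
		return t.Format("Jan _2 15:04")
	}
}

func main() {
	now := utc(2025, 6, 22, 0, 0)
	fmt.Println(formatListingTime(utc(2025, 6, 21, 12, 30), now, false)) // Jun 21 12:30
	fmt.Println(formatListingTime(utc(2020, 1, 5, 9, 0), now, false))    // Jan  5  2020
	fmt.Println(formatListingTime(utc(2025, 6, 21, 12, 30), now, true))  // Jun 21 12:30:00 2025
}
```

Passing now in as a parameter keeps the function deterministic, which makes it easier to test than calling time.Now() inside.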

formatMode and formatSize Functions


The formatMode function converts a file's mode into a string representation (e.g., drwxr-xr-x), handling file types (directory, symlink, etc.) and permissions, including special bits like setuid, setgid, and sticky. The formatSize function formats file sizes, supporting human-readable output (-h) with 1024-based unit suffixes (K, M, G, T, P, E) for improved readability.

func formatMode(mode fs.FileMode, isSymlink bool) string {
	var buf [10]byte

	// File type
	switch mode & fs.ModeType {
	case fs.ModeDir:
		buf[0] = 'd'
	case fs.ModeSymlink:
		buf[0] = 'l'
	case fs.ModeNamedPipe:
		buf[0] = 'p'
	case fs.ModeSocket:
		buf[0] = 's'
	case fs.ModeDevice | fs.ModeCharDevice:
		// Go sets ModeDevice together with ModeCharDevice for character devices
		buf[0] = 'c'
	case fs.ModeDevice:
		buf[0] = 'b'
	default:
		buf[0] = '-'
	}

	// Permissions
	perm := mode.Perm()

	// Owner permissions
	if perm&0400 != 0 {
		buf[1] = 'r'
	} else {
		buf[1] = '-'
	}
	if perm&0200 != 0 {
		buf[2] = 'w'
	} else {
		buf[2] = '-'
	}
	switch {
	case perm&0100 != 0 && mode&fs.ModeSetuid != 0:
		buf[3] = 's'
	case perm&0100 != 0:
		buf[3] = 'x'
	case mode&fs.ModeSetuid != 0:
		buf[3] = 'S'
	default:
		buf[3] = '-'
	}

	// Group permissions
	if perm&0040 != 0 {
		buf[4] = 'r'
	} else {
		buf[4] = '-'
	}
	if perm&0020 != 0 {
		buf[5] = 'w'
	} else {
		buf[5] = '-'
	}
	switch {
	case perm&0010 != 0 && mode&fs.ModeSetgid != 0:
		buf[6] = 's'
	case perm&0010 != 0:
		buf[6] = 'x'
	case mode&fs.ModeSetgid != 0:
		buf[6] = 'S'
	default:
		buf[6] = '-'
	}

	// Other permissions
	if perm&0004 != 0 {
		buf[7] = 'r'
	} else {
		buf[7] = '-'
	}
	if perm&0002 != 0 {
		buf[8] = 'w'
	} else {
		buf[8] = '-'
	}
	switch {
	case perm&0001 != 0 && mode&fs.ModeSticky != 0:
		buf[9] = 't'
	case perm&0001 != 0:
		buf[9] = 'x'
	case mode&fs.ModeSticky != 0:
		buf[9] = 'T'
	default:
		buf[9] = '-'
	}

	return string(buf[:])
}

func formatSize(size int64) string {
	if !opts.Human {
		return strconv.FormatInt(size, 10)
	}

	const (
		B  = 1
		KB = 1024 * B
		MB = 1024 * KB
		GB = 1024 * MB
		TB = 1024 * GB
		PB = 1024 * TB
		EB = 1024 * PB
	)

	switch {
	case size >= EB:
		return fmt.Sprintf("%.1fE", float64(size)/EB)
	case size >= PB:
		return fmt.Sprintf("%.1fP", float64(size)/PB)
	case size >= TB:
		return fmt.Sprintf("%.1fT", float64(size)/TB)
	case size >= GB:
		return fmt.Sprintf("%.1fG", float64(size)/GB)
	case size >= MB:
		return fmt.Sprintf("%.1fM", float64(size)/MB)
	case size >= KB:
		return fmt.Sprintf("%.1fK", float64(size)/KB)
	default:
		return strconv.FormatInt(size, 10)
	}
}
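
Two details are easy to check in isolation: how Go's fs.FileMode marks device files, and how the 1024-based suffixes come out. The sketch below uses hypothetical helpers (typeChar and humanSize) that take their inputs as parameters instead of reading the global opts. It relies on the fact that Go reports character devices with both ModeDevice and ModeCharDevice set, while block devices carry ModeDevice alone.

```go
package main

import (
	"fmt"
	"io/fs"
	"strconv"
)

// typeChar returns the first character of an ls-style mode string.
func typeChar(mode fs.FileMode) byte {
	switch mode & fs.ModeType {
	case fs.ModeDir:
		return 'd'
	case fs.ModeSymlink:
		return 'l'
	case fs.ModeNamedPipe:
		return 'p'
	case fs.ModeSocket:
		return 's'
	case fs.ModeDevice | fs.ModeCharDevice:
		return 'c' // character device: both bits are set
	case fs.ModeDevice:
		return 'b' // block device: ModeDevice only
	default:
		return '-'
	}
}

// humanSize formats size with 1024-based suffixes; sizes below 1024
// are printed as plain byte counts.
func humanSize(size int64) string {
	if size < 1024 {
		return strconv.FormatInt(size, 10)
	}
	units := []string{"K", "M", "G", "T", "P", "E"}
	value := float64(size) / 1024
	i := 0
	for value >= 1024 && i < len(units)-1 {
		value /= 1024
		i++
	}
	return fmt.Sprintf("%.1f%s", value, units[i])
}

func main() {
	fmt.Printf("%c %c %c\n",
		typeChar(fs.ModeDir|0o755),
		typeChar(fs.ModeDevice|fs.ModeCharDevice|0o666),
		typeChar(fs.ModeDevice|0o660)) // d c b
	fmt.Println(humanSize(1023), humanSize(1536), humanSize(1048576)) // 1023 1.5K 1.0M
}
```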

Screenshot


Screenshot 2025-06-21 at 12


Conclusion


I am aware that there are some issues. They will get resolved when I get time. Hope you liked this nice Friday evening code. You are free to use it in your org/home as long as you follow the license.



Visit GitHub Repository



#Linux #MacOS #OpenBSD #development #golang #research
