Minishell is a project in the core curriculum (the cursus) of the 42 Network programming schools. The goal of this project is to implement a simple version of a Bash-like shell in C, supporting features such as pipes, redirections, expansions, and command execution.
The project aims to provide a hands-on experience in developing a command-line interpreter, exploring concepts like process management, signal handling, and parsing user input. By building a mini shell, students gain a deeper understanding of how shell programs work under the hood and learn about the intricacies of handling user interactions and executing commands.
Minishell follows a simple yet effective workflow to process user input and execute commands:
- Read Line: The shell prompts the user for input and reads the entered command line.
- Parse Input: The input line is then parsed to identify individual tokens, such as the command name, arguments, and other special characters like pipes, redirections, or expansions.
- Execute Command: After parsing the input, the shell creates the necessary processes and executes the specified command. This may involve setting up pipes, handling redirections, or performing expansions as required.
- Repeat: Once the command execution is complete, the shell returns to the initial state and prompts the user for a new command, repeating the cycle.
This continuous loop of reading input, parsing, executing commands, and repeating allows the shell to provide an interactive command-line interface for users to execute various commands and perform complex operations.
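The loop described above can be sketched in a few lines of C. This is only an illustration under some assumptions: it reads with getline() rather than an interactive line editor, the names shell_loop and run_line are hypothetical, and run_line merely echoes the line where a real shell would parse and execute it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for the parse and execute stages: it only echoes. */
static int run_line(const char *line, FILE *out)
{
    assert(line != NULL);
    fprintf(out, "would run: %s\n", line);
    return 0;
}

/* Read-parse-execute loop. Returns the number of non-empty lines handled. */
int shell_loop(FILE *in, FILE *out)
{
    char    *line = NULL;
    size_t   cap = 0;
    ssize_t  len;
    int      handled = 0;

    for (;;)
    {
        fputs("minishell$ ", out);
        fflush(out);
        len = getline(&line, &cap, in);
        if (len < 0)                /* EOF (Ctrl+D) or read error ends the loop */
            break;
        if (len > 0 && line[len - 1] == '\n')
            line[len - 1] = '\0';   /* strip the trailing newline */
        if (line[0] == '\0')        /* empty line: just prompt again */
            continue;
        if (strcmp(line, "exit") == 0)
            break;
        run_line(line, out);
        handled++;
    }
    free(line);
    return handled;
}
```

Calling shell_loop(stdin, stdout) from main gives an interactive prompt. Passing the streams as parameters is just a convenience that makes the loop easy to drive from a file.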
The parsing process in Minishell involves two main stages: lexing and parsing.
The lexer, or tokenizer, is responsible for breaking down the input command line into a sequence of meaningful tokens. These tokens can represent commands, arguments, operators (such as pipes or redirections), or other special characters.
The lexer scans the input string character by character, applying a set of rules to identify and categorize the different token types. It generates a stream of tokens that serves as input for the parser.
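As a rough illustration of the character-by-character scan, here is a minimal lexer sketch. The token type names, the token struct, and the tokenize function are assumptions made for this example rather than a required design. It keeps quotes inside word tokens (quote removal would happen later) and skips most error handling, such as a failed allocation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum { TOK_WORD, TOK_PIPE, TOK_REDIR_IN, TOK_REDIR_OUT,
               TOK_APPEND, TOK_HEREDOC } tok_type;

typedef struct {
    tok_type type;
    char    *text;   /* heap-allocated copy of the token's characters */
} token;

static int is_operator_char(char c)
{
    return c == '|' || c == '<' || c == '>';
}

/* Length of the word starting at s. Quoted sections may hold spaces and operators. */
static size_t word_length(const char *s)
{
    size_t i = 0;

    while (s[i] && !isspace((unsigned char)s[i]) && !is_operator_char(s[i]))
    {
        if (s[i] == '\'' || s[i] == '"')
        {
            char quote = s[i++];
            while (s[i] && s[i] != quote)
                i++;
            if (s[i] == '\0')   /* unclosed quote: stop at end of input */
                break;
        }
        i++;
    }
    return i;
}

/* Splits line into at most max tokens and returns how many were produced. */
size_t tokenize(const char *line, token *out, size_t max)
{
    size_t n = 0;

    assert(line != NULL && out != NULL);
    while (*line && n < max)
    {
        size_t   len;
        tok_type type;

        if (isspace((unsigned char)*line)) { line++; continue; }
        if (*line == '|')                          { type = TOK_PIPE; len = 1; }
        else if (line[0] == '<' && line[1] == '<') { type = TOK_HEREDOC; len = 2; }
        else if (line[0] == '>' && line[1] == '>') { type = TOK_APPEND; len = 2; }
        else if (*line == '<')                     { type = TOK_REDIR_IN; len = 1; }
        else if (*line == '>')                     { type = TOK_REDIR_OUT; len = 1; }
        else                     { type = TOK_WORD; len = word_length(line); }
        out[n].type = type;
        out[n].text = strndup(line, len);
        n++;
        line += len;
    }
    return n;
}

void free_tokens(token *toks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(toks[i].text);
}
```

Two-character operators are checked before their one-character prefixes, so >> is never split into two > tokens.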
The parser takes the token stream generated by the lexer and constructs an abstract syntax tree (AST) that represents the structure of the command line. The AST is a hierarchical tree-like data structure that captures the relationships between commands, arguments, and operators.
The parser follows a set of grammar rules to analyze the token stream and build the AST. These rules define the valid syntax and semantics of the command line, ensuring that the input is correctly interpreted.
Once the AST is constructed, it can be traversed and processed by the shell to execute the specified commands, handle redirections, set up pipes, and perform any necessary expansions or substitutions.
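To make the idea of grammar rules concrete, the sketch below parses a deliberately tiny grammar, pipeline := command ('|' command)* with command := WORD+, into a binary tree. Everything here is an assumption for illustration: the node layout, the function names, and the convention that the token array ends with a TOK_END entry. Redirections are left out to keep it short.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { TOK_WORD, TOK_PIPE, TOK_END } tok_type;
typedef struct { tok_type type; const char *text; } token;

typedef enum { NODE_CMD, NODE_PIPE } node_type;
typedef struct node {
    node_type     type;
    const char  **argv;     /* NODE_CMD: NULL-terminated argument list */
    struct node  *left;     /* NODE_PIPE: command(s) writing to the pipe */
    struct node  *right;    /* NODE_PIPE: command reading from the pipe */
} node;

void free_ast(node *n)
{
    if (!n)
        return;
    free_ast(n->left);
    free_ast(n->right);
    free(n->argv);          /* the strings themselves belong to the tokens */
    free(n);
}

/* command := WORD+ */
static node *parse_command(const token **cur)
{
    size_t argc = 0;

    while ((*cur)[argc].type == TOK_WORD)
        argc++;
    if (argc == 0)
        return NULL;        /* syntax error: a command was expected */
    node *n = calloc(1, sizeof *n);
    if (!n)
        return NULL;
    n->type = NODE_CMD;
    n->argv = calloc(argc + 1, sizeof *n->argv);
    if (!n->argv) { free(n); return NULL; }
    for (size_t i = 0; i < argc; i++)
        n->argv[i] = (*cur)[i].text;
    *cur += argc;
    return n;
}

/* pipeline := command ('|' command)*, built left-associatively.
   toks must end with a TOK_END entry. Returns NULL on a syntax error. */
node *parse_pipeline(const token *toks)
{
    assert(toks != NULL);
    const token *cur = toks;
    node *left = parse_command(&cur);

    while (left && cur->type == TOK_PIPE)
    {
        cur++;
        node *right = parse_command(&cur);
        node *pipe_node = right ? calloc(1, sizeof *pipe_node) : NULL;
        if (!pipe_node) { free_ast(left); free_ast(right); return NULL; }
        pipe_node->type = NODE_PIPE;
        pipe_node->left = left;
        pipe_node->right = right;
        left = pipe_node;
    }
    return left;
}
```

For the tokens ls, -l, |, wc this yields a NODE_PIPE root whose left child is the ls -l command and whose right child is wc. A trailing pipe with no command after it is rejected.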
After the parsing stage, the shell enters the execution phase, where it processes the abstract syntax tree (AST) generated by the parser and carries out the specified commands and operations.
The shell traverses the AST, following the structure defined by the parser. During this traversal, it identifies the commands to be executed and their associated arguments, as well as any operators or special characters that require specific handling.
When a command is encountered, the shell checks if it is a built-in command or an external program. Built-in commands are executed directly within the shell process, while external programs are executed by creating a new child process using the fork() system call and invoking the appropriate program via an exec() system call.
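The fork-then-exec pattern for external programs might look like the sketch below. The function name run_external is hypothetical, and execvp() is used only for brevity because it searches PATH by itself. If your version of the project only permits execve(), you would resolve the executable path yourself before calling it.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs an external program and returns its exit status, shell-style. */
int run_external(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();

    if (pid < 0)
    {
        perror("fork");
        return 1;
    }
    if (pid == 0)
    {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);      /* only returns on failure */
        perror(argv[0]);
        _exit(127);                 /* conventional "command not found" status */
    }
    /* Parent: wait for the child and translate its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
    {
        perror("waitpid");
        return 1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* same convention Bash uses for $? */
    return 1;
}
```

The child calls _exit() rather than exit() after a failed exec so that it does not flush stdio buffers it inherited from the parent.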
Minishell supports variable expansion, allowing users to define and reference environment variables within their commands. During the execution phase, the shell scans the command line for variable references (e.g., $VAR) and substitutes them with their corresponding values.
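A simplified version of that substitution is sketched below. It assumes values come straight from getenv(), and the function name expand_vars is made up for this example. It deliberately ignores quoting rules and special parameters such as $?, which a complete shell would also need to handle.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of src with each $NAME replaced by the
   value of the environment variable NAME (or nothing if it is unset). */
char *expand_vars(const char *src)
{
    assert(src != NULL);
    size_t cap = strlen(src) + 1;
    size_t len = 0;
    char  *out = malloc(cap);

    if (!out)
        return NULL;
    while (*src)
    {
        const char *piece = src;    /* default: copy one literal character */
        size_t      piece_len = 1;

        if (src[0] == '$' && (isalpha((unsigned char)src[1]) || src[1] == '_'))
        {
            size_t name_len = 1;
            while (isalnum((unsigned char)src[1 + name_len])
                   || src[1 + name_len] == '_')
                name_len++;
            char *name = malloc(name_len + 1);
            if (!name) { free(out); return NULL; }
            memcpy(name, src + 1, name_len);
            name[name_len] = '\0';
            const char *value = getenv(name);
            free(name);
            piece = value ? value : "";
            piece_len = strlen(piece);
            src += 1 + name_len;    /* skip '$' and the variable name */
        }
        else
            src++;
        if (len + piece_len + 1 > cap)
        {
            cap = (len + piece_len + 1) * 2;
            char *bigger = realloc(out, cap);
            if (!bigger) { free(out); return NULL; }
            out = bigger;
        }
        memcpy(out + len, piece, piece_len);
        len += piece_len;
    }
    out[len] = '\0';
    return out;
}
```

A $ that is not followed by a valid name start (a letter or underscore) is copied through unchanged, so a string like "cost: 5$" survives expansion.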
Wildcard expansion is another feature supported by Minishell. When a command line contains wildcard characters (e.g., *, ?), the shell expands them to match the corresponding files or directories in the current working directory.
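One possible way to implement the matching is a small recursive matcher combined with a directory scan, sketched below. The names wildcard_match and print_matches are hypothetical. The recursion is simple but can be slow on patterns with many * characters, and unlike Bash the results are not sorted.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Returns 1 if name matches pattern, where '*' matches any run of
   characters (including none) and '?' matches exactly one character. */
int wildcard_match(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);
    if (*pattern == '\0')
        return *name == '\0';
    if (*pattern == '*')
        return wildcard_match(pattern + 1, name)
            || (*name != '\0' && wildcard_match(pattern, name + 1));
    if (*name != '\0' && (*pattern == '?' || *pattern == *name))
        return wildcard_match(pattern + 1, name + 1);
    return 0;
}

/* Prints the entries of dir_path that match pattern, one per line.
   Returns the number of matches, or -1 if the directory cannot be opened. */
int print_matches(const char *dir_path, const char *pattern, FILE *out)
{
    DIR *dir = opendir(dir_path);
    if (!dir)
        return -1;

    int            count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL)
    {
        /* Like Bash, hidden entries only match patterns starting with '.' */
        if (entry->d_name[0] == '.' && pattern[0] != '.')
            continue;
        if (wildcard_match(pattern, entry->d_name))
        {
            fprintf(out, "%s\n", entry->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

For example, print_matches(".", "*.c", stdout) lists the C source files in the current working directory.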
Minishell includes a set of built-in commands that are implemented directly within the shell. These commands provide functionality such as changing directories (cd) or setting environment variables (export). Built-in commands are executed without creating a new process. This avoids the overhead of fork() and, more importantly, lets commands such as cd and export change the state of the shell itself: a change made in a child process would be lost when that child exits.
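A common way to organize built-ins is a lookup table that maps names to functions, as in the sketch below. The table layout and the function names are assumptions for illustration. The export shown here only handles NAME=VALUE arguments and relies on setenv(), whereas a shell that manages its own copy of the environment would update that copy instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

typedef int (*builtin_fn)(char **argv);

/* cd [dir]: changes the shell's own working directory. */
static int builtin_cd(char **argv)
{
    const char *target = argv[1] ? argv[1] : getenv("HOME");

    if (!target)
    {
        fputs("cd: HOME not set\n", stderr);
        return 1;
    }
    if (chdir(target) != 0)
    {
        perror("cd");
        return 1;
    }
    return 0;
}

/* export NAME=VALUE...: arguments without '=' are ignored in this sketch. */
static int builtin_export(char **argv)
{
    for (int i = 1; argv[i]; i++)
    {
        char *eq = strchr(argv[i], '=');
        if (!eq || eq == argv[i])
            continue;
        char *name = strndup(argv[i], (size_t)(eq - argv[i]));
        if (!name || setenv(name, eq + 1, 1) != 0)
        {
            free(name);
            return 1;
        }
        free(name);
    }
    return 0;
}

static const struct { const char *name; builtin_fn fn; } builtins[] = {
    { "cd", builtin_cd },
    { "export", builtin_export },
};

/* Returns the handler for a built-in, or NULL for an external command. */
builtin_fn find_builtin(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof builtins / sizeof builtins[0]; i++)
        if (strcmp(builtins[i].name, name) == 0)
            return builtins[i].fn;
    return NULL;
}
```

The executor can call find_builtin(argv[0]) first and fall back to fork and exec only when it returns NULL.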
The shell also handles various signals, such as SIGINT (Ctrl+C) and SIGQUIT (Ctrl+\). Appropriate actions are taken in response to these signals, such as terminating the currently running process or printing diagnostic messages.
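Signal setup could be sketched with sigaction() as shown below. The policy is an assumption modeled on how an interactive Bash behaves at the prompt: SIGINT is caught and recorded, SIGQUIT is ignored, and a child restores the default actions before exec. The function and variable names are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Last signal seen by the handler. sig_atomic_t is safe to write from one. */
volatile sig_atomic_t g_last_signal = 0;

/* Handlers should do as little as possible: just record the signal. */
static void on_signal(int signo)
{
    g_last_signal = signo;
}

static int set_action(int signo, void (*handler)(int), int flags)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = handler;
    sa.sa_flags = flags;
    return sigaction(signo, &sa, NULL);
}

/* At the prompt: catch Ctrl+C, ignore Ctrl+\. Returns 0 on success. */
int install_prompt_signals(void)
{
    if (set_action(SIGINT, on_signal, SA_RESTART) != 0)
        return -1;
    if (set_action(SIGQUIT, SIG_IGN, 0) != 0)
        return -1;
    return 0;
}

/* In a child, before exec: let the program receive signals normally. */
int restore_default_signals(void)
{
    if (set_action(SIGINT, SIG_DFL, 0) != 0)
        return -1;
    if (set_action(SIGQUIT, SIG_DFL, 0) != 0)
        return -1;
    return 0;
}
```

The main loop can check g_last_signal after each read, reset it to 0, and redraw the prompt when it was SIGINT.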
Now that we've covered the theoretical aspects of how Minishell works, it's time to dive into the implementation details. In this section, we'll explore the core components of the shell, including the lexer, parser, and executor.
It's important to note that the code snippets provided here represent my approach to implementing Minishell. While they serve as a reference and a starting point, feel free to adjust the data structures, algorithms, and overall design as per your understanding and coding style.
Remember, there are multiple ways to achieve the same functionality, and the true learning experience lies in understanding the underlying concepts and crafting your own solution. The goal is not to merely copy and paste the code but to use it as a guide and adapt it to your specific needs and preferences.