scribery / tlog
Terminal I/O logger
Home Page: http://scribery.github.io/tlog/
License: GNU General Public License v2.0
Tlog-rec needs to support running without a terminal connected to either or both of STDIN/STDOUT, with those streams connected to e.g. a file or a pipe instead. This is particularly necessary for SSH sessions with a command to execute, e.g. 'ssh host command arg' invocations.
Make tlog-rec and the future tlog-play configuration validation function return validation error string instead of printing it, so the function could be used in assertions.
Add assertions using it throughout the configuration loading and usage routines.
Split JSON encoding streams into two forks each: one for text and another for binary, each having the same interface and working with the dispatcher. This would also require changing the timing string format to have symmetric operations for reading/skipping and would actually make it simpler. This can also make stream implementations simpler and more self-contained, even though the actual code size would increase.
Implement a test verifying the coding style of the source, using indent, clang-format, or a similar tool.
Tlog-rec.c has a number of error-handling issues, where grc is either not initialized properly or not assigned error codes, at least in the "run" function. These need to be fixed.
Tlog-rec/tlog-play error messages generated in raw terminal mode need to be output after restoring terminal settings, otherwise they might appear garbled or in an inappropriate position on the terminal.
To test tlog end-to-end, add integration tests. Possible tests follow:
- Check that tlog-rec handles the -c option correctly.
- Check that tlog-play handles the -l/--lax option correctly.
- Check that tlog-play handles the -s/--speed option correctly.
- Check that tlog-play handles the -p/--paused option correctly.
- Check that tlog-play handles the -g/--goto option correctly.
- Check that tlog-play handles the -f/--follow option correctly with any log source.
- Check that tlog-play handles the -S/--journal-since, -U/--journal-until, and -M/--journal-match options correctly.

Most, if not all, integration tests should run within "make check".
Add assertions validating manipulated/created objects to all function exit paths as an additional consistency tracking measure.
Check that the shell spawned by tlog-rec gets only the environment it needs and doesn't inherit unnecessary file descriptors and other resources.
As tlog-rec cannot be used safely and reliably to record superuser sessions, another approach needs to be employed. One option is a "jump server": the user would first log in to a special "jump" server where tlog-rec would be running, and from there (automatically?) log in to the target server.
In this case some of the logged data might need to be altered, such as the hostname or the user name. Consider the whole setup, how it would work, and implement the extra flexibility, if necessary.
When reading a packet from a JSON source, merge similar timing runs across messages. This can reduce the number of terminal write syscalls on playback, e.g. when a lot of output is produced and message size is not that big.
To aid testing both during development and by users, implement file playback in tlog-play.
Add assert invocations validating the various transaction API types on function entry, exit, and elsewhere as appropriate.
To support future message format modification with backward compatibility, each tlog message needs to contain a version field.
Make the tlog-rec executable SUID/SGID to a dedicated user/group. Drop back to the original user/group before exec'ing the shell in the fork. This will keep tlog-rec safe from modification by the recorded user and would also allow authenticated filtering of the log messages.
Upon RPM installation, a special user and group, both named "tlog", should be created, and tlog-rec should be owned by them and marked SUID and SGID. After forking the shell process, tlog-rec should drop to the real UID/GID with setuid(getuid()) and setgid(getgid()).
For additional security, consider retrieving the effective user/group IDs and setting them to the real IDs with seteuid/setegid right after starting, passing them around (say, as euid/egid), and only setting them back with seteuid/setegid when necessary, i.e. when reading the configuration and when opening the logging socket/file.
Also consider calling setreuid(euid, euid)/setregid(egid, egid) in the parent process after forking the shell, so the parent (recording) process would not be able to regain the user's real IDs. However, ignore EPERM, perhaps producing a warning, as POSIX says that might not be permitted, even though it works on Linux and some BSDs.
To verify: once the setreuid/setregid use is implemented, check that all user/group IDs are handled correctly in the file and syslog writers.
At the moment the configuration schema file has a "default" flag for each parameter. However, it only makes sense for the "file" origin, as only there can a defaults file exist. Consider removing this flag and replacing it with something else. Perhaps add an origin "none" or some such, which goes before "file" and means parameters without defaults.
As tlog-rec always caches the recorded data for at least one second, the timestamp the log server adds lags behind and doesn't represent the exact time the input occurred, which might make correlating with other log messages more difficult.
Implement adding a real time timestamp to each message signifying the real and absolute time the first timing entry has arrived.
Consider implementing a playback interface which would allow correlating terminal I/O with at least audit messages, all fetched from ElasticSearch. Perhaps we can simply fetch any specified indexes, join them by absolute timestamp and allow scrolling them side-by-side. Say, a bunch of options to tlog-play.
Make sure all blocks in C source code use braces, including single-line if branches and loop bodies.
Make the .spec file and distribution tarball acceptable into the Fedora project. See Fedora packaging guidelines and package addition guide.
Implement a (GNOME?) GUI playback app, or a backend to tlog-play, which would reproduce window sizes and terminal types by using a terminal widget such as VteTerminal, resizing the window and/or font automatically, as recorded. As a simpler approach, consider using XEmbed to embed an actual terminal and resizing it accordingly.
Use RedHat-supplied Coverity / Clang Analyzer service to test the code for issues, then apply the necessary fixes.
Implement limiting recording to one per session.
Perhaps have a tlog-only-writable directory (tlog-rec will be SUID when #5 is fixed) with files named after session IDs, containing tlog-rec PIDs. Session IDs uniquely identify logins within a single system boot. Tlog-rec also uses them in its JSON messages.
So, if tlog-rec is started and finds that this session already has a file in
that directory, and the process with the stored PID is alive, it just drops the
privileges and execs the user shell.
Implement support for logging to journald in tlog-rec. Should take the form of a JSON writer type.
It should be possible to limit tlog recording throughput so that neither local nor remote systems are overwhelmed with too many log messages and don't run out of space. The best way would be to have the logging server implement per-process rate limiting with flow control, i.e. without dropping the log messages. However, as the existing logging servers don't seem to do that, we'll need to implement that in tlog. The limit might be expressed in encoded bytes per second and observed at the writer level.
Figure out the proper behavior and priority between "opts" and "args" configuration origins in tlog-rec. If not possible, consider making three origins instead.
Add checks that timestamps passed to sinks do not exceed the maximum "pos" recognized by sources, supposedly so that deltas stay within TLOG_DELAY_MAX.
As per suggestion from Jakub, implement pluggable memory allocation in tlog if the library grows bigger and there is a potential for outside users.
Add a standard command-line interface to tlog-play, including options, online help and manpage. Consider using the same approach as for tlog-rec, schema and everything. Consider the necessity of a configuration file.
At the moment the JSON sink uses the timestamp of the first written packet as the origin of all the messages. However the first packet's timestamp might get shifted if the packet was an incomplete character terminated later, resulting in the first message having non-zero "pos".
Make each recorded message start with window size, so the stream can be played back starting from any message and the window would be resized accordingly.
The simplest way to specify the shell to tlog-rec might be specifying a special shell for the user - one for each possible invoked shell. Such shell might start with "tlog-rec-shell-" and end with the shell name. E.g. "tlog-rec-shell-bash". The actual executable would be just a symlink to tlog-rec and tlog-rec would extract the shell to be invoked from argv[0]. This would make standalone setups easier to implement and wouldn't require using pam_env or pam_sss to pass the shell to tlog-rec via environment variables.
Tlog-rec shouldn't record when running under an X session - other software should be used to record the whole graphical session instead, not just terminals. Add a way to detect such invocations and simply exec the shell after dropping back to the original user.
Check that all the "object types" have both init/cleanup and create/destroy functions where possible, for consistency.
The "timing" message field stores information on stream runs and window sizes, plus delays between them. Even though it is actually timing, it is also metadata for all the streams and forks involved, and might be better named as such from the perspective of the considerable code chunk that implements them. Check if "timing" might actually be better named "meta" both in messages and in the code.
Provide an option to limit the whole logged JSON message size in the JSON sink, including all the fields and formatting. This can be done by formatting a dummy message with empty in/out_txt/bin fields as soon as the "pos" field is known, and getting its length.
As Jakub suggested, the standard FQDN retrieval approach used by tlog-rec might be unreliable. Consider using a different method, having a fallback, and/or relying on a configured value.
Make sure all header files have a comprehensive overview of their contents in the header, in Doxygen format.
Add setting SHELL environment variable to the actual shell in tlog-rec. Otherwise programs trying to spawn the user's shell would start tlog-rec instead.
To ship Doxygen documentation with the devel package, and to make sure it is always valid, add documentation generation to the build system, either to the "all" target or to a special target.
Add a field to every message specifying the type of the terminal it was recorded on. Check whether the TERM environment variable can be used reliably or another approach is necessary.
Move the common .m4 files out of include/tlog and into a dedicated directory, if a more suitable name or location is found.
As tlog-rec doesn't support character encoding conversion yet, implement character encoding detection and abort tlog-rec if the terminal uses non-UTF-8 charset. Same goes for tlog-play.
An invocation of "locale charmap" can return the used charset name, but might be too burdensome and a lighter way may need to be found.
See if tlog-rec can be used with sudo to any benefit. Check if there's anything tlog needs to do for that. E.g. tlog-rec might be assigned as the shell for a dedicated privileged user, which is not root, but would still need to start some specific shell.
As users can use one of a multitude of character encodings and JSON only supports UTF-8, tlog-rec needs to convert the received data accordingly.
An invocation of "locale charmap" can return the used encoding name, but might be too burdensome and a lighter way may need to be found.
We can use iconv for conversion, but we need a way to detect and extract invalid characters for separate, binary storage. Iconv seems to be able to stop at the first invalid byte, which can be used to implement that. We can use two iconv descriptors: one normal and another discarding invalid characters (with "//IGNORE" encoding suffix) to somehow extract invalid bytes. However we need to check how far back all the required functionality goes and if it's available in RHEL6.