scribery / tlog
Terminal I/O logger
Home Page: http://scribery.github.io/tlog/
License: GNU General Public License v2.0
Tlog-rec needs to support running without a terminal connected to either or both of STDIN/STDOUT, with those streams connected to e.g. a file or a pipe instead. This is particularly necessary for SSH sessions with a command to execute, e.g. 'ssh host command arg' invocations.
Make tlog-rec and the future tlog-play configuration validation function return validation error string instead of printing it, so the function could be used in assertions.
Add assertions using it throughout the configuration loading and usage routines.
Split JSON encoding streams into two forks each: one for text and another for binary, each having the same interface and working with the dispatcher. This would also require changing the timing string format to have symmetric operations for reading/skipping and would actually make it simpler. This can also make stream implementations simpler and more self-contained, even though the actual code size would increase.
Implement a test verifying the coding style of the source, using indent, clang-format, or a similar tool.
Tlog-rec.c has a number of error-handling issues, where grc is either not initialized properly or not assigned error codes, at least in the "run" function. These need to be fixed.
Tlog-rec/tlog-play error messages generated in raw terminal mode need to be output after restoring terminal settings, otherwise they might appear garbled or in an inappropriate position on the terminal.
To test tlog end-to-end, add integration tests. Possible tests follow:
- Check that tlog-rec handles the -c option correctly.
- Check that tlog-play handles the -l/--lax option correctly.
- Check that tlog-play handles the -s/--speed option correctly.
- Check that tlog-play handles the -p/--paused option correctly.
- Check that tlog-play handles the -g/--goto option correctly.
- Check that tlog-play handles the -f/--follow option correctly with any log source.
- Check that tlog-play handles the -S/--journal-since, -U/--journal-until, and -M/--journal-match options correctly.

Most, if not all, integration tests should run within "make check".
Add assertions validating manipulated/created objects to all function exit paths as an additional consistency tracking measure.
Check that the shell spawned by tlog-rec gets only the environment it needs and doesn't inherit unnecessary file descriptors and other resources.
As tlog-rec cannot be used safely and reliably to record superuser sessions, another approach needs to be employed. One option is a "jump server": the user would first log in to a special "jump" server where tlog-rec would be running, and from there (automatically?) log in to the target server.
In this case some of the logged data might need to be altered, such as the hostname or the user name. Consider the whole setup, how it would work, and implement the extra flexibility, if necessary.
When reading a packet from a JSON source, merge similar timing runs across messages. This can reduce the number of terminal write syscalls on playback, e.g. when a lot of output is produced and message size is not that big.
To aid testing both during development and by users, implement file playback in tlog-play.
Add assert invocations validating the various transaction API types on function entry, exit, and elsewhere as appropriate.
To support future message format modification with backward compatibility, each tlog message needs to contain a version field.
Make the tlog-rec executable SUID/SGID to a dedicated user/group. Drop back to the original user/group before exec'ing the shell in the fork. This will keep tlog-rec safe from modification by the recorded user and would also allow authenticated filtering of the log messages.
Upon RPM installation, a special user and group, both named "tlog", should be created, and tlog-rec should be owned by them and marked SUID and SGID. After forking the shell process, tlog-rec should drop to the real UID/GID with setuid(getuid()) and setgid(getgid()).
For additional security, consider retrieving the effective user/group IDs and setting them to the real IDs with seteuid/setegid right after starting, passing them around (say, as euid/egid), and only setting them back with seteuid/setegid when necessary, i.e. when reading the configuration and when opening the logging socket/file.
Also consider calling setreuid(euid, euid)/setregid(egid, egid) in the parent process after forking the shell, so the parent (recording) process would not be able to regain the user's real IDs. However, ignore EPERM, perhaps producing a warning, as POSIX says that might not be permitted, even though it works on Linux and some BSDs.
To verify: once the setreuid/setregid use is implemented, check that all user/group IDs are handled correctly in the file and syslog writers.
At the moment the configuration schema file has a "default" flag for each parameter. However, it only makes sense for the "file" origin, as only there can a defaults file exist. Consider removing this flag and replacing it with something else. Perhaps add an origin "none" or some such, which goes before "file" and means parameters without defaults.
As tlog-rec always caches the recorded data for at least one second, the timestamp the log server adds lags behind and doesn't represent the exact time the input occurred, which might make correlating with other log messages more difficult.
Implement adding a real time timestamp to each message signifying the real and absolute time the first timing entry has arrived.
Consider implementing a playback interface which would allow correlating terminal I/O with at least audit messages, all fetched from ElasticSearch. Perhaps we can simply fetch any specified indexes, join them by absolute timestamp and allow scrolling them side-by-side. Say, a bunch of options to tlog-play.
Make sure all blocks in C source code use braces, including single-line if branches and loop bodies.
Make the .spec file and distribution tarball acceptable into the Fedora project. See Fedora packaging guidelines and package addition guide.
Implement a (GNOME?) GUI playback app, or a backend to tlog-play, which would reproduce window sizes and terminal types by using a terminal widget such as VteTerminal, resizing the window and/or font automatically, as recorded. As a simpler approach, consider using XEmbed to embed an actual terminal and resizing it accordingly.
Use RedHat-supplied Coverity / Clang Analyzer service to test the code for issues, then apply the necessary fixes.
Implement limiting recording to one per session.
Perhaps have a tlog-only-writable directory (tlog-rec will be SUID when #5 is fixed) with files named after session IDs, containing tlog-rec PIDs. Session IDs uniquely identify logins within a single system boot. Tlog-rec also uses them in its JSON messages.
So, if tlog-rec is started and finds that this session already has a file in
that directory, and the process with the stored PID is alive, it just drops the
privileges and execs the user shell.
Implement support for logging to journald in tlog-rec. Should take the form of a JSON writer type.
It should be possible to limit tlog recording throughput so that neither local nor remote systems are overwhelmed with too many log messages and don't run out of space. The best way would be to have the logging server implement per-process rate limiting with flow control, i.e. without dropping the log messages. However, as the existing logging servers don't seem to do that, we'll need to implement that in tlog. The limit might be expressed in encoded bytes per second and observed at the writer level.
Figure out the proper behavior and priority between "opts" and "args" configuration origins in tlog-rec. If not possible, consider making three origins instead.
Add checks that timestamps passed to sinks do not exceed the maximum "pos" recognized by sources, supposedly so that deltas stay within TLOG_DELAY_MAX.
As per suggestion from Jakub, implement pluggable memory allocation in tlog if the library grows bigger and there is a potential for outside users.
Add a standard command-line interface to tlog-play, including options, online help and manpage. Consider using the same approach as for tlog-rec, schema and everything. Consider the necessity of a configuration file.
At the moment the JSON sink uses the timestamp of the first written packet as the origin of all the messages. However the first packet's timestamp might get shifted if the packet was an incomplete character terminated later, resulting in the first message having non-zero "pos".
Make each recorded message start with window size, so the stream can be played back starting from any message and the window would be resized accordingly.
The simplest way to specify the shell to tlog-rec might be specifying a special shell for the user - one for each possible invoked shell. Such shell might start with "tlog-rec-shell-" and end with the shell name. E.g. "tlog-rec-shell-bash". The actual executable would be just a symlink to tlog-rec and tlog-rec would extract the shell to be invoked from argv[0]. This would make standalone setups easier to implement and wouldn't require using pam_env or pam_sss to pass the shell to tlog-rec via environment variables.
Tlog-rec shouldn't record when running under an X session - other software should be used to record the whole graphical session instead, not just terminals. Add a way to detect such invocations and simply exec the shell after dropping back to the original user.
Check that all the "object types" have both init/cleanup and create/destroy functions where possible, for consistency.
The "timing" message field stores information on stream runs and window sizes, plus delays between them. Even though it is actually timing, it is also metadata for all the streams and forks involved, and might be better named as such from the perspective of the considerable code chunk that implements them. Check if "timing" might actually be better named "meta" both in messages and in the code.
Provide an option to limit the whole logged JSON message size in the JSON sink, including all the fields and formatting. This can be done by formatting a dummy message with empty in/out_txt/bin fields as soon as the "pos" field is known, and getting its length.
As Jakub suggested, the standard FQDN retrieval approach used by tlog-rec might be unreliable. Consider using a different method, having a fallback, and/or relying on a configured value.
Make sure all header files have a comprehensive overview of their contents in the header, in Doxygen format.
Add setting SHELL environment variable to the actual shell in tlog-rec. Otherwise programs trying to spawn the user's shell would start tlog-rec instead.
To ship Doxygen documentation with the devel package, and to make sure it is always valid, add documentation generation to the build system, either to the "all" target or to a special target.
Add a field to every message specifying the type of the terminal it was recorded on. Check whether the TERM environment variable can be used reliably or another approach is necessary.
Move the common .m4 files out of include/tlog and into a dedicated directory, if a more suitable name or location is found.
As tlog-rec doesn't support character encoding conversion yet, implement character encoding detection and abort tlog-rec if the terminal uses non-UTF-8 charset. Same goes for tlog-play.
An invocation of "locale charmap" can return the used charset name, but might be too burdensome and a lighter way may need to be found.
See if tlog-rec can be used with sudo to any benefit. Check if there's anything tlog needs to do for that. E.g. tlog-rec might be assigned as the shell for a dedicated privileged user, which is not root, but would still need to start some specific shell.
As users can use one of a multitude of character encodings and JSON only supports UTF-8, tlog-rec needs to convert the received data accordingly.
An invocation of "locale charmap" can return the used encoding name, but might be too burdensome and a lighter way may need to be found.
We can use iconv for conversion, but we need a way to detect and extract invalid characters for separate, binary storage. Iconv seems to be able to stop at the first invalid byte, which can be used to implement that. We can use two iconv descriptors: one normal and another discarding invalid characters (with "//IGNORE" encoding suffix) to somehow extract invalid bytes. However we need to check how far back all the required functionality goes and if it's available in RHEL6.