
dslabs's Introduction

DO NOT DISTRIBUTE OR PUBLICLY POST SOLUTIONS TO THESE LABS. MAKE ALL FORKS OF THIS REPOSITORY WITH SOLUTION CODE PRIVATE.

Distributed Systems Labs and Framework

Ellis Michael
University of Washington

DSLabs is a new framework for creating, testing, model checking, visualizing, and debugging distributed systems lab assignments.

The best way to understand distributed systems is by implementing them. And as the old saying goes, "practice doesn't make perfect; perfect practice makes perfect." That is, it's one thing to write code which usually works; it's another thing entirely to write code which works in all cases. The latter endeavor is far more useful for understanding the complexities of the distributed programming model and specific distributed protocols.

Testing distributed systems, however, is notoriously difficult. One thing we found in previous iterations of the distributed systems class at UW is that many students would write implementations which passed all of our automated tests but nevertheless were incorrect, often in non-trivial ways. Some of these bugs would only manifest themselves in later assignments, while others would go entirely unnoticed by our tests. We were able to manually inspect students' submissions and uncover some of these errors, but this approach to grading does not scale and does not provide the immediate feedback of automated tests.

The DSLabs framework and labs are engineered around the goal of helping students understand and correctly implement distributed systems. The framework provides a suite of tools for creating automated tests, including model checking tests which systematically explore the state space of students' implementations. These tests are far more likely to catch common distributed systems bugs, especially those that depend on precise orderings of messages. Moreover, when a bug is found, these search-based tests output a trace which leads to the error, making debugging dramatically simpler. Finally, DSLabs is integrated with a visual debugging tool, which allows students to graphically explore executions of their systems and visualize invariant-violating traces found by the model checker.

Programming Model

The DSLabs framework is built around message-passing state machines (also known as I/O automata or distributed actors), which we call nodes. These basic units of a distributed system consist of a set of message and timer handlers; these handlers define how the node updates its internal state, sends messages, and sets timers in response to an incoming message or timer. These nodes are run in single-threaded event loops, which take messages from the network and timers from the node's timer queue and call the node's handlers for those events.
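For example, a node is an ordinary class whose handler methods the event loop invokes by naming convention. The following is an illustrative sketch only, loosely modeled on the Lab 0 ping example; the Ping, Pong, and PingTimer types and the retry constant are invented here, not excerpts from the framework:

// Illustrative sketch of a node (message/timer types and constant invented).
class PingServer extends Node {
    static final int RETRY_MILLIS = 100; // hypothetical retry interval

    PingServer(Address address) {
        super(address);
    }

    @Override
    public void init() {
        // Runs once at startup; a node might set its initial timers here.
    }

    // The event loop calls this when a Ping message is delivered.
    private void handlePing(Ping m, Address sender) {
        send(new Pong(m.value()), sender);
    }

    // The event loop calls this when a PingTimer fires.
    private void onPingTimer(PingTimer t) {
        set(t, RETRY_MILLIS); // e.g., re-set the timer to retry later
    }
}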

This model of computation is typically the one we use when introducing distributed systems for the first time, and the one we use when we want to reason about distributed protocols and prove their correctness. The philosophy behind this framework is that by giving students a programming environment which mirrors the mathematical model in which distributed protocols are described, we put them on the best footing to reason about their own implementations.

Testing and Model Checking

The lab infrastructure has a suite of tools for creating automated test cases for distributed systems. These tools make it easy to express the scenarios the system should be tested against (e.g., varying client workloads, network conditions, failure patterns, etc.) and then run students' implementations on an emulated network (it is also possible to replace the emulated network interface with an interface to the actual network).

While executing specific scenarios is useful for uncovering bugs in students' implementations, it is difficult to test all possible scenarios that might occur. Moreover, once these tests uncover a problem, it is a challenge to discover its root cause. Because of its node-centric view of distributed computation, the DSLabs framework enables a more thorough form of testing: model checking.

Model checking a distributed system is conceptually simple. First, the initial state of the system is configured. Then, we say that one state of the system, s₂ (consisting of the internal state of all nodes, the state of their timer queues, and the state of the network), is the successor of another state s₁ if it can be obtained from s₁ by delivering a single message or timer that is pending in s₁. A state might have multiple successor states. Model checking is the systematic exploration of this state graph, the simplest approach being breadth-first search. The DSLabs model checker lets us define invariants that should be preserved (e.g., linearizability) and then search through all possible orderings of events to make sure those invariants hold in students' implementations. When an invariant violation is found, the model checker can produce a minimal trace which leads to the violation.
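At its core, this is an ordinary graph search. Below is a simplified, single-threaded sketch of the idea; the State interface and the invariant predicate are stand-ins for the framework's actual machinery, not its real API:

import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Predicate;

class ModelCheckerSketch {
    interface State {
        List<State> successors(); // one per deliverable pending message/timer
    }

    // Returns an invariant-violating state if one is reachable, else null.
    static State bfs(State init, Predicate<State> invariant) {
        Queue<State> frontier = new ArrayDeque<>();
        Set<State> discovered = new HashSet<>();
        frontier.add(init);
        discovered.add(init);
        while (!frontier.isEmpty()) {
            State s = frontier.poll();
            if (!invariant.test(s)) {
                return s; // a trace is recoverable via parent pointers
            }
            for (State succ : s.successors()) {
                if (discovered.add(succ)) {
                    frontier.add(succ);
                }
            }
        }
        return null; // invariant holds in every reachable state
    }
}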

While model checking is useful and has been used extensively in industry and academia to find bugs in distributed systems, exploration of the state graph is still a fundamentally hard problem; the size of the graph is typically exponential in the depth of the search. To extend the usefulness of model checking even further, the test infrastructure lets us prune the portion of the state graph we explore for an individual test, guiding the search towards common problems while still exploring all possible executions in the remaining portion of the state space.
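As a hypothetical illustration, in the style of the settings calls that appear in the labs' tests (the addPrune method and the predicate names here are assumptions, not verbatim framework API):

// Hypothetical search configuration; method and predicate names are assumed.
searchSettings.maxTimeSecs(30);
searchSettings.addInvariant(RESULTS_OK);   // checked in every explored state
searchSettings.addPrune(CLIENTS_CRASHED);  // don't explore past matching states
searchSettings.addGoal(CLIENTS_DONE);      // a state the search hopes to reach
bfs(initSearchState);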

The DSLabs model checker is built to be usable by students and to be as transparent as is practical. Students are required to make certain accommodations for the model checker's sake, but we try to limit these and provide tools that help validate the model checker's assumptions and debug model checking performance issues. Moreover, the model checker itself is not designed with state-of-the-art performance as its only goal. Building a model checker that can test student implementations of runnable systems written in a general-purpose language such as Java requires striking a balance between usability and performance.

Visualization

This framework is integrated with a visual debugger. This tool allows students to interactively explore executions of the distributed systems they build. By exploring executions of their distributed system, students can very quickly test their own hypotheses about how their nodes should behave, helping them discover bugs in their protocols and gain a deeper understanding for the way their systems work. Additionally, the tool is used to visualize the invariant-violating traces produced by the model-checker.

Assignments

We currently have four individual assignments in this framework. In these projects, students incrementally build a distributed, fault-tolerant, sharded, transactional key/value store!

  • Lab 0 provides a simple ping protocol as an example.
  • Lab 1 has students implement an exactly-once RPC protocol on top of an asynchronous network. They re-use the pieces they build in lab 1 in later labs.
  • Lab 2 introduces students to fault-tolerance by having them implement a primary-backup protocol.
  • Lab 3 asks students to take the lessons learned in lab 2 and implement Paxos.
  • Lab 4 has students build a sharded key/value store out of multiple replica groups, each of which uses Paxos internally for replication. They finish by implementing a two-phase commit protocol to handle multi-key updates.

Parts of this sequence of assignments (especially labs 2 and 4) are adapted from the MIT 6.824 Labs. The finished product is a system whose core design is very similar to production storage systems like Google's Spanner.

We have used the DSLabs framework and assignments in distributed systems classes at the University of Washington.

Directory Overview

  • framework/src contains the interface students program against.
  • framework/tst contains the testing infrastructure.
  • framework/tst-self contains the tests for the interface and testing infrastructure.
  • labs contains a subdirectory for each lab. The lab directories each have a src directory initialized with skeleton code where students write their implementations, as well as a tst directory containing the tests for that lab.
  • handout-files contains files to be directly copied into the student handout, including the main README and run-tests.py.
  • grading contains scripts created by the course's previous TAs to batch-grade submissions.
  • www contains the DSLabs website which is built with Jekyll.

The master branch of this repository is not set up to be distributed to students as-is. The Makefile has targets to build the handout directory and handout.tar.gz, which contain a single JAR with the compiled framework, testing infrastructure, and all dependencies. The handout branch of this repository is an auto-built version of the handout.

Contributing

The main tools for development are the same as the students' dependencies: Java 17 and Python 3. You will also need a few utilities such as wget to build with the provided Makefile; MacOS users will need gtar and gcp, provided by the coreutils Homebrew package, and gsed, provided by the gnu-sed Homebrew package.

IntelliJ files are provided and include a code style used by this project. In order to provide IntelliJ with all of the necessary libraries, you must run make dependencies once after cloning the repository and whenever you add to or modify the project's dependencies. You will also need the Lombok IntelliJ plugin.

This project uses google-java-format to format Java files. You should run make format before committing changes. If you want IntelliJ to apply the same formatting, you will need the google-java-format IntelliJ plugin, and you will need to apply the necessary post-install settings in IntelliJ.

If you add fields to any student-visible classes (all classes in the framework package as well as SearchState and related classes), you should take care to ensure that toString prints the correct information and that the classes are cloned correctly. See dslabs.framework.testing.utils.Cloning for more details. Also see Lombok's @ToString annotation for more information about customizing its behavior. In particular, note that transient and static fields are ignored by default by all cloning, serialization, and toString methods.
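A minimal illustration of that last point (the class and field names are invented):

// Invented example: how field modifiers interact with the note above.
class ExampleServer {
    private int round;              // cloned, serialized, and printed by toString
    private transient int scratch;  // ignored by cloning, serialization, and toString
    private static int instances;   // likewise ignored
}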

Acknowledgements

The framework and labs have been improved thanks to valuable contributions from:

  • Alex Saveau
  • Andrew Wei
  • Arman Mohammed
  • Doug Woos
  • Guangda Sun
  • James Wilcox
  • John Depaszthory
  • Kaelin Laundry
  • Logan Gnanapragasam
  • Nick Anderson
  • Paul Yau
  • Sarang Joshi
  • Thomas Anderson

The lab assignments, especially labs 2 and 4, were adapted with permission from the MIT 6.824 labs developed by Robert Morris and colleagues.

Contact

Bug reports and feature requests should be submitted using the GitHub issues tool. Email Ellis Michael ([email protected]) with any other questions.

If you use these labs in a course you teach, I'd love to hear from you!


dslabs's Issues

Gradle build fails

Describe the bug
Opening the project in IntelliJ IDEA 2023.1 and running the Gradle build fails with the error below:

A problem occurred evaluating root project 'dslabs'.
Could not set unknown property 'classifier' for task ':scaffoldingSourcesJar' of type com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar.


Environment

  • OS: Windows 11
  • JDK (output of java --version): OpenJDK 14
  • DSLabs githash: e708af3

Use MacOS Menu Bar

On MacOS, we should use the system menu bar in the new viz tool instead of creating a menu bar in Swing and docking it at the top of the window.

Use failAndContinue liberally throughout tests

Most of the tests (with the exception of search tests' searches for goal states) exit when they fail. It would be better to run tests through to completion, reporting failures with the recently added failAndContinue method. (Of course, some tests must still fail outright when a precondition for continuing isn't met.)

Additional requirements:

  • Clean up BaseJUnitTest#failedSearchTest. It's no longer needed if failAndContinue exists.
  • Failures should be printed as they happen so that the errors appear in their proper context. They shouldn't be printed twice, though. This might have to be done with a static IdentityHashSet that tracks which exceptions have been printed (see the sketch after this list). This isn't the current behavior of failAndContinue.
  • Users should be able to disable this behavior and get all tests to fail fast with a flag to run_tests.py.
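A minimal sketch of the print-once idea from the list above (the class and method names are invented):

import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class FailureReporterSketch {
    // Identity-based set, so equal-but-distinct exceptions still print.
    private static final Set<Throwable> alreadyPrinted =
            Collections.newSetFromMap(new IdentityHashMap<>());

    static void reportFailure(Throwable t) {
        if (alreadyPrinted.add(t)) {
            t.printStackTrace(); // print in context, exactly once
        }
    }
}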

Confusion about the network of dslabs

I read Network.java, RunSettings.java, RunState.java, and TestSetting.java, but I did not see any code implementing delays, duplications, or reorderings of messages. Is the network in dslabs not fully asynchronous?

Add a demo of @Log to Lab 0

It would be nice to introduce students to proper logging infrastructure using @Log. I think the best place for this would be a short writeup as part of Lab 0, which we probably wouldn't cover explicitly in week 1 (because students' heads are already full) but could point back to later, while students are working on Lab 2 or so. Generally, student log messages should be at level FINE or above.

The writeup should also include a description of why logging is better than println. The main reason is that log messages are off by default (when properly leveled), and in particular they are off when run on Gradescope. (Every quarter we have a handful of students submit solutions that call println in every event, generating gigabytes of log data. Our Gradescope script parses the log, and it will choke if the log does not fit in memory, which on Gradescope is a couple of gigabytes at most.)
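A minimal sketch of the kind of usage the writeup might show (the class and method are invented; @Log is Lombok's java.util.logging annotation):

import lombok.extern.java.Log;

// Invented example demonstrating leveled logging in a student class.
@Log
class ExampleClient {
    void handleReply(String reply) {
        // FINE messages are off by default, and off on Gradescope, unlike println.
        log.fine(() -> "received reply: " + reply);
    }
}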

Don't count invariant computation time against clients' wait times

Tests like this one

public void test16SinglePartition() throws InterruptedException {
    final int nClients = 5, nServers = 5;
    setupStates(nServers);
    runSettings.addInvariant(RESULTS_OK);
    runState.start(runSettings);

    // Startup the clients
    for (int i = 1; i <= nClients; i++) {
        runState.addClientWorker(client(i), differentKeysInfiniteWorkload, false);
    }

    Thread.sleep(5000);
    assertRunInvariantsHold();

    // Partition off some servers and the clients
    List<Address> partition = Lists.newArrayList(server(1), server(2), server(3));
    for (int i = 1; i <= nClients; i++) {
        partition.add(client(i));
    }
    runSettings.partition(partition);
    Thread.sleep(1000);
    assertRunInvariantsHold();

    // Heal the partition
    runSettings.reconnect();
    Thread.sleep(5000);

    // Shut the clients down
    runState.stop();

    runSettings.addInvariant(LOGS_CONSISTENT);
    assertRunInvariantsHold(); // report invariant errors first
    assertMaxWaitTimeLessThan(3000);
}

evaluate invariants before calculating each client's maximum wait time. Because there is always an outstanding request at the end of the run, the time it takes to compute the invariants is added to the latency of that last request.

We still want invariant violations to take precedence over maxWaitTime violations, though. That is, if a test run violates an invariant and has a client max wait time greater than the limit, the invariant violation should always be reported. We need to decide on some mechanism for doing that and then fix all of the tests with this pattern.

Add Checker Framework nullness checker

  1. Add a GitHub action check that runs the nullness checker
  2. Standardize on a set of null/notnull annotations to use, ban all uses of others with an automatic check

Problems running make while installing dslabs on M1 ARM processors

While installing the dslabs framework on my system (M1 Pro MacBook), I came across a few problems.

  1. wget is not preinstalled on MacBooks, so the build fails because wget is missing.
  2. gcp -> This tool is a fancier version of cp but is also more complicated. When the gcp command ran, it was unable to find a directory. The change I made was to replace gcp with cp in the Makefile. That solved the problem, and I successfully built the framework.

I think this should be added to the wiki, since many students are using ARM machines. I don't think this is an ARM-specific error; maybe gcp misbehaves while cherry-picking files and transferring them.

Feature Request: add detail messages for more/all invariant predicates

Currently, some invariant predicates do not have detail messages. The most prominent is the APPENDS_LINEARIZABLE predicate. It could be useful to explain why the invariant was violated (e.g., for APPENDS_LINEARIZABLE, we could write something like "AppendResult(y) is not a valid result for Append(x)" or "AppendResult(y) for client 2 is inconsistent with AppendResult(x) for client 1"). The other invariant I see without a detail message is MULTI_GETS_MATCH.

These omissions might be intentional (e.g., maybe the goal is to get students to identify what's wrong in the sequence), so I'm not opening a pull request yet.

Lab 1: clarify what should be done when handling "old requests"

In this section, it is unclear what the desired behavior for old requests is. I was stuck on this for a while and figured it out by experimenting with different behaviors until the tests passed.

You will also need to deal with what happens when the server receives "old" Requests.

I'd write up suggestions or a PR for how it could be improved, but it's possible that the ambiguity is intended to be part of the challenge.

Print formatted results to file

All of the current grading infrastructure is based on the output from stderr/stdout. This is suboptimal for a couple of reasons. First, it means the output must be parsed by grading scripts, which is error-prone and subject to potential mischief. Second, stderr/stdout necessarily contain all of students' logging statements. Asking students to disable all logging before submitting has historically not had a high success rate, and sufficiently verbose logging can make the output of test runs very large.

There should be a JUnit test listener which, when a global flag is set, logs results as they happen and at the end of a test run outputs them to a file in a structured format (probably JSON). This output should optionally contain a copy of stdout/stderr, which can be obtained with the TeeStdOutErr utility. I'm not sure what the default should be.
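A rough sketch of such a listener, using JUnit 4's RunListener API (the file name and JSON schema here are invented):

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import org.junit.runner.Result;
import org.junit.runner.notification.Failure;
import org.junit.runner.notification.RunListener;

// Invented sketch: record failures as they happen, dump JSON at the end.
public class ResultsFileListener extends RunListener {
    private final List<String> failed = new ArrayList<>();

    @Override
    public void testFailure(Failure failure) {
        failed.add(failure.getDescription().getDisplayName());
    }

    @Override
    public void testRunFinished(Result result) throws Exception {
        StringBuilder json = new StringBuilder("{\"failed\":[");
        for (int i = 0; i < failed.size(); i++) {
            if (i > 0) json.append(',');
            json.append('"').append(failed.get(i)).append('"');
        }
        json.append("],\"runCount\":").append(result.getRunCount()).append('}');
        Files.writeString(Path.of("results.json"), json.toString());
    }
}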

Both of these configuration options should probably be accessible through run_tests.py. Students might want to use this test output themselves.

Ultimately, some sort of schema for the test output would be really useful. It would likely evolve over time, but at least we would have something that grading scripts (in this repo and developed by other instructors for their use cases) could reference.

Visual debugger display too small on Arch with Xmonad WM

Describe the bug
When running the visual debugger using either ./run-tests.py .. --debug ... or ./run-tests.py .. --start-viz ..., the display is very small and cannot be resized.


Environment

  • OS: Arch Linux
  • Window manager: XMonad
  • JDK (output of java --version): openjdk 17.0.6 2023-01-17

Error running make: `gcp` is not recognized on zsh shell

Describe the bug
The file copier tool gcp is not recognized by the zsh shell on MacOS 13.2.1 (the problem may be more general, but it at least covers this case). For Darwin, the preferred file copier ($(CP)) in the dslabs Makefile is gcp instead of cp. However, make clean all fails when gcp is invoked (in the line: $(CP) -r labs handout-files/. $(OTHER_FILES) $@). Changing gcp to cp fixes the problem. Of course, changing to cp might break the cases for which gcp was chosen in the first place, but the Makefile should at least check whether the current shell recognizes gcp. Users should be able to run make out of the box once they have the required/recommended tools set up (Python 3, Make, Java 14, IntelliJ).


Environment

  • OS: MacOS Ventura 13.2.1
  • JDK (output of java --version):
    java 17.0.4.1 2022-08-18 LTS
    Java(TM) SE Runtime Environment (build 17.0.4.1+1-LTS-2)
    Java HotSpot(TM) 64-Bit Server VM (build 17.0.4.1+1-LTS-2, mixed mode, sharing)
  • DSLabs githash: 35df0e9 (master branch)

Add search option to not exit after goal found

Throughout the search tests now, we have the pattern:

searchSettings.maxTimeSecs(30);
searchSettings.addGoal(CLIENTS_DONE);
searchSettings.addInvariant(RESULTS_OK);
bfs(initSearchState);
assertGoalFound();

searchSettings.clearGoals();
bfs(initSearchState);

If the first bfs is not instant, this wastes significant time. It would be nice to have a searchSettings.stopOnGoal() option instructing the search to continue even if it hits a goal-matching state (while still logging that state). This will require some re-architecture of SearchResults and BaseJUnitTest as well as Search.java. Additionally, trace minimization should probably be done on the main thread instead of worker threads.
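With such an option, the duplicated pattern above might collapse to a single search. A sketch of the proposed usage (stopOnGoal does not exist yet; its signature is assumed):

// Hypothetical usage of the proposed option: one search instead of two.
searchSettings.maxTimeSecs(30);
searchSettings.addGoal(CLIENTS_DONE);
searchSettings.addInvariant(RESULTS_OK);
searchSettings.stopOnGoal(false); // proposed: log goal states, keep searching
bfs(initSearchState);
assertGoalFound();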

Switch to Google Java formatter

The weirdnesses in IntelliJ's formatter keep cropping up, and there are changes with major version releases. We should standardize on the Google Java formatter. This means:

  1. Enforcing formatting with a GitHub action check.
  2. Removing the formatting files from the maintainer's and student's IntelliJ settings dirs.
  3. Creating a make format or similar target that applies the formatter.
  4. Setting up Google Java formatting in IntelliJ and adding the necessary settings to the maintainer's settings dir.

Show delivered messages when opened viz is started with a trace

Right now, "View delivered messages" is disabled by default on startup. If the new viz tool is started with a trace that delivers duplicate messages, then when you're stepping through the trace, it looks like those messages come out of nowhere. There are two options here; both have merit.

  1. Detect if the trace a DebuggerWindow is opened with uses duplicate messages. If so, enable "View delivered messages" on startup.
  2. Always display the "next" event in the linear history shown by the events panel, even if the event is a duplicate message and "View delivered messages" is disabled.

Visual Debugger

Hello, how long should it take until the list of servers appears and the visual debugger is ready to start? It stays pending for me. I ran python3 run-tests.py --lab 0 --debug 1 1 GET:foo and python3 run-tests.py --lab 1 --debug 1 1 GET:foo. Are we supposed to implement anything first? I have not modified the code yet.

Environment:
Python: 3.7.3
Java:
openjdk 14 2020-03-17
OpenJDK Runtime Environment (build 14+36-1461)
OpenJDK 64-Bit Server VM (build 14+36-1461, mixed mode, sharing)
OS: Debian GNU/Linux 10 (buster)
browser: Google Chrome 87.0.4280.88


Search Tests Hanging without Logging

Hi
I am a graduate student at Georgia Tech, and we are using these labs as part of our Distributed Computing course. I often see the search tests hang, but when I run them with FINEST logging, they terminate. The hang is probably due to some thread getting stuck.
An example follows:

vpb@vpb-Inspiron-7560:~/gatech/sem2/cs7210/assignments/7210-assignments$ ./run-tests.py --lab 3 --test 21

TEST 21: Single client, no progress in minority [SEARCH] (15pts)

Starting breadth-first search...
Explored: 0, Depth: 0 (0.01s, 0.00K states/s)
Explored: 24085, Depth: 7 (5.01s, 4.81K states/s)
Explored: 59821, Depth: 8 (10.01s, 5.98K states/s)
Explored: 89775, Depth: 9 (15.05s, 5.96K states/s)
Explored: 109218, Depth: 9 (20.74s, 5.27K states/s)
Explored: 147122, Depth: 9 (25.85s, 5.69K states/s)
Explored: 158166, Depth: 9 (30.00s, 5.27K states/s)
Search finished.

Starting breadth-first search...
Explored: 1, Depth: 0 (0.00s, 1.00K states/s)
Explored: 37368, Depth: 173 (6.74s, 5.54K states/s)
Explored: 82224, Depth: 256 (17.14s, 4.80K states/s)
Explored: 103777, Depth: 288 (28.56s, 3.63K states/s)
Explored: 113273, Depth: 301 (30.00s, 3.78K states/s)
Search finished.

...PASS (60.32s)

Tests passed: 1/1
Points: 15/15 (100.00%)
Total time: 60.331s

ALL PASS

vpb@vpb-Inspiron-7560:~/gatech/sem2/cs7210/assignments/7210-assignments$ ./run-tests.py --lab 3 --test 21

TEST 21: Single client, no progress in minority [SEARCH] (15pts)

Starting breadth-first search...
Explored: 0, Depth: 0 (0.01s, 0.00K states/s)
Explored: 21888, Depth: 7 (5.01s, 4.37K states/s)
Explored: 25142, Depth: 7 (10.01s, 2.51K states/s)
Explored: 25142, Depth: 7 (15.01s, 1.68K states/s)
Explored: 25142, Depth: 7 (20.01s, 1.26K states/s)
Explored: 25142, Depth: 7 (25.01s, 1.01K states/s)
^CTraceback (most recent call last):
  File "./run-tests.py", line 183, in <module>
    main()
  File "./run-tests.py", line 179, in main
    assertions=args.assertions)
  File "./run-tests.py", line 90, in run_tests
    subprocess.call(command)
  File "/usr/lib/python2.7/subprocess.py", line 172, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.7/subprocess.py", line 1099, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib/python2.7/subprocess.py", line 125, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt
But when I run these with logging enabled, they consistently pass. Is there any framework issue related to this?

Don't make test numbers dependent on filters

Currently, if you filter out any tests (e.g., by choosing a part of the lab or by only running search tests etc.), all of the tests are renumbered. This is because test numbers are assigned by JUnit after all of the filtering takes place. It would be much nicer to have a consistent numbering scheme. Test numbers should probably have "fully qualified" names, assigned by annotation. For example, the "number" of test 4 in part 2 would be "2.4" (a String rather than an int).

Then, from ./run-tests.py, if --part is specified, individual tests can be selected without referring to the fully qualified name (i.e., you can specify --lab 1 --part 2 -n 4 or --lab 1 -n 2.4 but not --lab 1 -n 4).

A complication is labs with only one part. Their test numbers shouldn't be 1.1, 1.2 etc. but just 1, 2. And referring to tests this way in ./run-tests.py should be valid.

It would also be nice to sort tests based on the numbering annotation rather than method name. Test numbers could then be removed from method names. I'm not sure if this is possible with the version of JUnit we're using.

One question is whether the annotation should have the fully qualified name, or a simple integer and pick up the part number from the test class. There are arguments for both approaches, but having the fully qualified name (e.g., @TestNumber("2.4")) is probably best because it allows students to easily look at the method and know how to run it.
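A minimal sketch of the annotation under the fully-qualified approach (hypothetical; no such annotation exists yet):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation carrying a fully qualified test number.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface TestNumber {
    String value(); // e.g., "2.4"; just "4" for single-part labs
}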

Lastly, it would be great if we could validate in an automated test that test numbers are sequential, there are no duplicates, every test has a number, etc. This should go in the tst-self directory.
https://github.com/emichael/dslabs/tree/master/framework/tst-self/

Possible deadlock in the framework with the search test

When running some students' solutions with the latest dslabs framework, I noticed that the search tests would get stuck; e.g., a test with a 30-second timeout would run forever. Using jstack to look into the threads, I found a deadlock between the search threads. It seems they are all blocked at Search.java:499, discovered.add(successor.wrapped()).

The following is printed by jstack

Found one Java-level deadlock:

"Thread-12":
waiting to lock monitor 0x00007fa138008f00 (object 0x0000000780b1c3f8, a java.util.Hashtable),
which is held by "Thread-17"

"Thread-17":
waiting to lock monitor 0x00007fa13800f400 (object 0x0000000780b2a3b0, a java.util.Hashtable),
which is held by "Thread-16"

"Thread-16":
waiting to lock monitor 0x00007fa138009400 (object 0x0000000781588248, a java.util.Hashtable),
which is held by "Thread-14"

"Thread-14":
waiting to lock monitor 0x00007fa13800f400 (object 0x0000000780b2a3b0, a java.util.Hashtable),
which is held by "Thread-16"

Java stack information for the threads listed above:

"Thread-12":
at java.util.Hashtable.hashCode([email protected]/Hashtable.java:864)
- waiting to lock <0x0000000780b1c3f8> (a java.util.Hashtable)
at dslabs.primarybackup.PBServer.hashCode(PBServer.java:19)
at java.util.Objects.hashCode([email protected]/Objects.java:117)
at java.util.HashMap$Node.hashCode([email protected]/HashMap.java:298)
at java.util.AbstractMap.hashCode([email protected]/AbstractMap.java:527)
at dslabs.framework.testing.AbstractState.hashCode(AbstractState.java:49)
at dslabs.framework.testing.search.SearchState.hashCode(SearchState.java:68)
at dslabs.framework.testing.search.SearchState$SearchEquivalenceWrappedSearchState.hashCode(SearchState.java:623)
at java.util.concurrent.ConcurrentHashMap.putVal([email protected]/ConcurrentHashMap.java:1012)
at java.util.concurrent.ConcurrentHashMap.put([email protected]/ConcurrentHashMap.java:1006)
at java.util.Collections$SetFromMap.add([email protected]/Collections.java:5654)
at dslabs.framework.testing.search.BFS.exploreNode(Search.java:499)
at dslabs.framework.testing.search.BFS.lambda$getWorker$0(Search.java:479)
at dslabs.framework.testing.search.BFS$$Lambda$133/0x0000000800c07440.run(Unknown Source)
at dslabs.framework.testing.search.Search.lambda$run$0(Search.java:275)
at dslabs.framework.testing.search.Search$$Lambda$132/0x0000000800c07040.run(Unknown Source)
at java.lang.Thread.run([email protected]/Thread.java:832)
"Thread-17":
at java.util.Hashtable.size([email protected]/Hashtable.java:248)
- waiting to lock <0x0000000780b2a3b0> (a java.util.Hashtable)
at java.util.Hashtable.equals([email protected]/Hashtable.java:822)
- locked <0x0000000780b1c3f8> (a java.util.Hashtable)
at dslabs.primarybackup.PBServer.equals(PBServer.java:19)
at java.util.AbstractMap.equals([email protected]/AbstractMap.java:493)
at dslabs.framework.testing.AbstractState.equals(AbstractState.java:49)
at dslabs.framework.testing.search.SearchState.equals(SearchState.java:68)
at java.util.Objects.equals([email protected]/Objects.java:78)
at dslabs.framework.testing.search.SearchState$SearchEquivalenceWrappedSearchState.equals(SearchState.java:606)
at java.util.concurrent.ConcurrentHashMap.putVal([email protected]/ConcurrentHashMap.java:1039)
- locked <0x00000007814a8c08> (a java.util.concurrent.ConcurrentHashMap$Node)
at java.util.concurrent.ConcurrentHashMap.put([email protected]/ConcurrentHashMap.java:1006)
at java.util.Collections$SetFromMap.add([email protected]/Collections.java:5654)
at dslabs.framework.testing.search.BFS.exploreNode(Search.java:499)
at dslabs.framework.testing.search.BFS.lambda$getWorker$0(Search.java:479)
at dslabs.framework.testing.search.BFS$$Lambda$133/0x0000000800c07440.run(Unknown Source)
at dslabs.framework.testing.search.Search.lambda$run$0(Search.java:275)
at dslabs.framework.testing.search.Search$$Lambda$132/0x0000000800c07040.run(Unknown Source)
at java.lang.Thread.run([email protected]/Thread.java:832)
"Thread-16":
at java.util.Hashtable.size([email protected]/Hashtable.java:248)
- waiting to lock <0x0000000781588248> (a java.util.Hashtable)
at java.util.Hashtable.equals([email protected]/Hashtable.java:822)
- locked <0x0000000780b2a3b0> (a java.util.Hashtable)
at dslabs.primarybackup.PBServer.equals(PBServer.java:19)
at java.util.AbstractMap.equals([email protected]/AbstractMap.java:493)
at dslabs.framework.testing.AbstractState.equals(AbstractState.java:49)
at dslabs.framework.testing.search.SearchState.equals(SearchState.java:68)
at java.util.Objects.equals([email protected]/Objects.java:78)
at dslabs.framework.testing.search.SearchState$SearchEquivalenceWrappedSearchState.equals(SearchState.java:606)
at java.util.concurrent.ConcurrentHashMap.putVal([email protected]/ConcurrentHashMap.java:1039)
- locked <0x0000000781565b00> (a java.util.concurrent.ConcurrentHashMap$Node)
at java.util.concurrent.ConcurrentHashMap.put([email protected]/ConcurrentHashMap.java:1006)
at java.util.Collections$SetFromMap.add([email protected]/Collections.java:5654)
at dslabs.framework.testing.search.BFS.exploreNode(Search.java:499)
at dslabs.framework.testing.search.BFS.lambda$getWorker$0(Search.java:479)
at dslabs.framework.testing.search.BFS$$Lambda$133/0x0000000800c07440.run(Unknown Source)
at dslabs.framework.testing.search.Search.lambda$run$0(Search.java:275)
at dslabs.framework.testing.search.Search$$Lambda$132/0x0000000800c07040.run(Unknown Source)
at java.lang.Thread.run([email protected]/Thread.java:832)
"Thread-14":
at java.util.Hashtable.size([email protected]/Hashtable.java:248)
- waiting to lock <0x0000000780b2a3b0> (a java.util.Hashtable)
at java.util.Hashtable.equals([email protected]/Hashtable.java:822)
- locked <0x0000000781588248> (a java.util.Hashtable)
at dslabs.primarybackup.PBServer.equals(PBServer.java:19)
at java.util.AbstractMap.equals([email protected]/AbstractMap.java:493)
at dslabs.framework.testing.AbstractState.equals(AbstractState.java:49)
at dslabs.framework.testing.search.SearchState.equals(SearchState.java:68)
at java.util.Objects.equals([email protected]/Objects.java:78)
at dslabs.framework.testing.search.SearchState$SearchEquivalenceWrappedSearchState.equals(SearchState.java:606)
at java.util.concurrent.ConcurrentHashMap.putVal([email protected]/ConcurrentHashMap.java:1039)
- locked <0x0000000780f7f258> (a java.util.concurrent.ConcurrentHashMap$Node)
at java.util.concurrent.ConcurrentHashMap.put([email protected]/ConcurrentHashMap.java:1006)
at java.util.Collections$SetFromMap.add([email protected]/Collections.java:5654)
at dslabs.framework.testing.search.BFS.exploreNode(Search.java:499)
at dslabs.framework.testing.search.BFS.lambda$getWorker$0(Search.java:479)
at dslabs.framework.testing.search.BFS$$Lambda$133/0x0000000800c07440.run(Unknown Source)
at dslabs.framework.testing.search.Search.lambda$run$0(Search.java:275)
at dslabs.framework.testing.search.Search$$Lambda$132/0x0000000800c07040.run(Unknown Source)
at java.lang.Thread.run([email protected]/Thread.java:832)

Found 1 deadlock.

Debuggability

Hi, I am currently a graduate student at Georgia Tech. We are using your system for our distributed computing course this semester, and I have finished labs 1-3. While doing the labs, I found it can be very hard to debug the search tests, especially the BFS searches. I found a way to log the event sequence for the final state across several BFS runs, and I used it by manually entering the events into the viz debugger. After I finish lab 4, I might try to add a feature that lets students input event sequences (copy and paste a series of events for a BFS state) into the viz debugger. Can you give me some suggestions about where to start? Thank you.

Allow for multiple rows of node states, reordering nodes

Right now, all nodes are laid out in a single row. On very wide screens, this is fine. On smaller screens, trying to view more than 4 nodes at once is difficult. Nodes usually don't need the entire vertical height of the screen. There is quite a lot of white space in the node state box, and scrolling through the message inbox/timer queue above a node is preferable to scrolling horizontally when space is tight.

It would be nice to be able to lay out SingleNodePanels in multiple rows. I attempted a quick and dirty version where there were either one or two rows, and when there were two rows, they were separated by a JSplitPane. This ran into some bizarre issues. I think the "right" way is to use multiple JXMultiSplitPanes, one in vertical mode to hold the rows, and others in horizontal mode to hold each row, and use this trick to display only one row.

// XXX: Disgusting hack to show only 1 leaf node
if (numShown == 1) {
    Leaf l = new Leaf("DUMMY-LEAF-1");
    l.setWeight(0.0);
    layoutNodes.add(new Divider());
    layoutNodes.add(l);
}
split.setChildren(layoutNodes);
splitPane.setModel(split);
for (Address a : addresses) {
    if (!nodesActive.get(a).isSelected()) {
        continue;
    }
    splitPane.add(statePanels.get(a), a.toString());
}
if (numShown == 1) {
    splitPane.add(new JPanel(), "DUMMY-LEAF-1");
    layout.displayNode("DUMMY-LEAF-1", false);
}

The other piece that would be nice is a way for users to reorder nodes. This could be as simple as replacing the "Show/hide nodes" panel in the sidebar with a reorderable list of checkboxes. This tutorial might be helpful: http://www.java2s.com/Tutorial/Java/0240__Swing/Usedraganddroptoreorderalist.htm The best version of that feature would be the ability to drag and drop SingleNodePanels by dragging the node name, but that seems very difficult.

Don't replace timers on update, highlight new timers

The new viz tool currently replaces all timers on any state change. It also does not highlight newly added timers like it does with messages. Part of the issue is that timers are trickier to deal with since queues are semi-ordered and can have duplicates.

See:

/*
TODO: do the same thing for timers...
This is tricky, though. Messages are unique because of the network
model. But for timers, there might be multiple copies of the same
one in the queue. We need to diff the lists intelligently to make
sure we only pickup the newly added timers. We know the new ones are
at the back of the list, but if the most recent event was a timer
delivery, that makes things nasty.
Just updating the list with the new one is easy, but giving timers
the correct TreeDisplayType is hard.
*/

Lab 1 part 2 question

Why is it necessary to use a sequence number in messages, given that I don't maintain state on the server and a request can be executed multiple times?

Or is it just used to determine which response the client expects in handleReply() and onClientTimer()?

Thanks.
