Comments (3)

lrhn commented on August 15, 2024

Also reproduces on Linux for me. Since it exists on more than one OS, it's probably a Dart issue, not an OS issue.
It seems to depend on the size of the file entries in the directory. I changed the filename format to file#### with leading zeros, so the number of files doesn't change the file name length.
I also open the FIFO for writing ahead of time, just to be prepared to close it.

With that, the problem starts at 45 files, with one file not being deleted.
(It seems that it was opening the FIFO that changed the number from 46 to 45, so maybe it's related to open file descriptors.)

My test program below has the following behavior for different temporary file counts:

  • 45+: One undeleted file, except at 60, which leaves no undeleted files.
  • 78+: Two undeleted files. At 89 and 108, the directory listing stalls.
  • 109+: Three undeleted files. At 121 and 124, the directory listing stalls.
  • 140+: Four undeleted files. At 152 and 153, the directory listing stalls.
  • 172+: Five undeleted files. At 184 and 185, the directory listing stalls.
  • 204+: Six undeleted files. I didn't check any further.

That's a roughly consistent distance of 31 to 33 between changes, which suggests that it's related to some kind of block size. (Or chunks of open file descriptors.)
That's corroborated by the undeleted files consistently being at the same position in the iteration for a given file count, while that position differs between file counts.
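The spacing can be checked directly from the thresholds in the list above. This is only a small arithmetic sketch; the helper name `gaps` is made up for illustration and is not part of the test program.

```dart
/// Differences between consecutive values of [thresholds].
List<int> gaps(List<int> thresholds) => [
      for (var i = 1; i < thresholds.length; i++)
        thresholds[i] - thresholds[i - 1],
    ];

void main() {
  // File counts at which one more file is left undeleted (from the list above).
  const thresholds = [45, 78, 109, 140, 172, 204];
  print(gaps(thresholds)); // [33, 31, 31, 32, 32]
}
```

The first step (45 to 78) is 33, and the later ones are 31 or 32.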

My best guess is that the async list and multiple async deletes in the same directory are somehow introducing a race condition, that makes something go wrong.

Maybe the delete tries to remove a file at the same time that the previous delete causes the directory structure to be updated, and then the second delete fails to update the directory structure after deleting the file. Or the directory structure is updated and compressed just as the iteration is about to switch to a new sector, which is now no longer part of the directory.
That all depends on the individual operations not being atomic, and iteration not being stable against deletes (it probably is against atomic deletes, but if the delete isn't atomic, who knows what happens.)

Or maybe it's Dart reusing file descriptors. I know macOS is known for having a low number of available file descriptors, so if Dart tries to reuse them, and does so too early, that might have an effect. (But that's guessing blindly, based only on the fact that doing fifo.openWrite() makes the problems start at 45 instead of 46.)

Why that depends on having an open FIFO is where it gets weird.
Or rather, depends on having an open FIFO file read that is blocked on missing input. If I close the FIFO file by fifo.openWrite().close(); before the deletion step, the issue goes away. (Doing the fifo-write-close after deleting prevents the subscription.cancel() from hanging, but doesn't avoid the failing deletions.)

So something goes wrong with deleting, and sometimes with iterating.

⚠️ Notice ⚠️: The pattern used here, of asynchronously iterating the directory, deleting files asynchronously while iterating, and just storing the futures until after the loop, is known to be unsafe. Not because of this weird behavior, which might still be a bug, but because it stores a future with no handler across an asynchronous gap. If that gap takes more time than expected, and a future completes with an error, that error becomes unhandled because the handler isn't added until after the loop completes. (Which is why the error becomes uncaught in the 109-file case, where the loop takes too long to complete.)
The code should at least call .ignore() on the future before storing it.
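A minimal sketch of that advice, assuming Dart 3 (for the `wait` extension and `ParallelWaitError`). The names `flakyOperation` and `runAll` are hypothetical stand-ins: the first plays the role of `File.delete()`, and the short delay in the loop plays the role of the asynchronous gap between directory listing events.

```dart
import 'dart:async';

/// Hypothetical stand-in for an operation such as `File.delete()`.
/// Operation #1 fails so that there is an error to handle.
Future<void> flakyOperation(int index) async {
  if (index == 1) throw StateError('operation #$index failed');
}

/// Starts [count] operations, with an asynchronous gap after each start,
/// and returns how many of them failed.
Future<int> runAll(int count) async {
  final futures = <Future<void>>[];
  for (var i = 0; i < count; i++) {
    // `..ignore()` attaches a no-op handler immediately, so an error that
    // arrives during the gap below is not reported as unhandled.
    // The cascade stores the original future, which still carries the error.
    futures.add(flakyOperation(i)..ignore());
    // Assumed stand-in for waiting on the next directory entry.
    await Future<void>.delayed(const Duration(milliseconds: 1));
  }
  try {
    await futures.wait;
    return 0;
  } on ParallelWaitError<List<void>, List<Object?>> catch (e) {
    // The errors are still delivered here, after the loop.
    return e.errors.nonNulls.length;
  }
}

Future<void> main() async {
  final failures = await runAll(3);
  print('failed operations: $failures');
}
```

Calling `ignore()` does not swallow the error for later listeners. It only guarantees that the future has a handler during the gap, so the failure is reported through `futures.wait` instead of reaching the zone's uncaught-error handler.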

The heavily instrumented test code I used:

import 'dart:async';
import 'dart:io';

final Stopwatch sw = Stopwatch();
void log(String text) {
  var e = sw.elapsedMilliseconds;
  print("(${"$e".padLeft(3)}): $text");
}

// Args[0] is number of temporary files.
// 45+: One undeleted file
//   - 60: No undeleted files
// 78+: Two undeleted files
//   - 89, 108: Stalls in `list`
// 109+: Three undeleted files
//   - 121, 124: Stalls in `list`
// 140+: Four undeleted files
//   - 152, 153: Stalls in `list`
// 172+: Five undeleted files
//   - 184, 185: Stalls in `list`
// 204+: Six undeleted files
void main(List<String> args) async {
  // Safety wrapper around real main, to catch uncaught errors
  // and exit if stalled.
  var exitCode = await runZonedGuarded(() => _main(args), (e, s) {
    log("Uncaught error: $e\n$s");
  }, zoneValues: {#_ez: "Own error zone"})!
      .timeout(Duration(milliseconds: 500), onTimeout: () {
    log("Stalled, exiting");
    throw exit(0);
  });
  exit(exitCode);
}

Future<int> _main(List<String> args) async {
  var exitCode = 0;
  var fileCount = 45;
  if (args.isNotEmpty) fileCount = int.tryParse(args.first) ?? fileCount;

  sw.start();

  log("Creating named pipe or file");
  final (pipeDir, fifo) = await createNamedPipe();
  log("Listen to named pipe or file");

  final openRead = fifo.openRead().listen((v) {
    log("FIFO EVENT: $v");
  }, onError: (e, s) {
    log("FIFO ERROR: $e");
  }, onDone: () {
    log("FIFO DONE");
  });
  final openWrite = await fifo.openWrite(mode: FileMode.write);

  final dir = await createTmpFiles(fileCount);

  try {
    await deleteFiles(dir, fileCount);
  } on ParallelWaitError<List<void>, List<Object?>> catch (e) {
    exitCode = 1;
    log('Timeout while deleting files: ${e.errors.nonNulls.length}');
    var errors = e.errors;
    for (var i = 0; i < errors.length; i++) {
      var error = errors[i];
      if (error != null) print("     : Caught error #$i: ${error}");
    }
    for (var entity in dir.listSync()) {
      log("Surviving file: ${entity.path}");
    }
  } catch (e) {
    exitCode = 1;
    log("Unexpected error: $e");
  } finally {
    log("Deleting temporary file directory");
    try {
      await dir.delete(recursive: true);
      log("Deleted directory");
      if (dir.existsSync()) print("   Not successfully?");
    } catch (e) {
      log("Unexpected error: $e");
    }
  }
  log("Cancelling subscription");

  openWrite.close();
  await (openRead.cancel());
  log("Subscription cancelled");
  await pipeDir.delete(recursive: true);
  log("Pipe directory deleted");
  return exitCode;
}

Future<(Directory, File)> createNamedPipe() async {
  final dir = await Directory.systemTemp.createTemp();
  var fifo = File('${dir.path}/fifo');
  await Process.run('mkfifo', [fifo.path]);
  print("CREATED FIFO: ${fifo.path}");
  return (dir, fifo);
}

Future<Directory> createTmpFiles(int count) async {
  log("Creating temporary files");
  final dir = await Directory.systemTemp.createTemp();
  log("Created temporary file directory: ${dir.path}");
  for (var i = 0; i < count; i++) {
    var file = File('${dir.path}/file${"$i".padLeft(4, "0")}');
    await file.create();
    log("Created file: ${file.path}");
  }
  log("Created temporary files");
  return dir;
}

Future<void> deleteFiles(Directory dir, int fileCount,
    [int timeout = 200]) async {
  var t0 = sw.elapsedMilliseconds;
  log("Deleting temporary files");
  List<Future<void>> futures = [];
  var dur = Duration(milliseconds: timeout);
  log("Listing temporary files");
  int i = 0;
  int completed = 0;
  await for (var entity in dir.list()) {
    if (entity is File) {
      var index = i++;
      log("Deleting #$index: ${entity.path}");
      var e0 = sw.elapsedMilliseconds - t0;
      if (e0 > timeout) log("Still listing files after $e0 ms");
      futures.add(
        entity.delete().then((v) {
          completed++;
          var e1 = sw.elapsedMilliseconds - t0;
          log("Deleted #$index: ${entity.path}${e1 > timeout ? " too late($e1)" : ""}");
          return v;
        }).timeout(dur, onTimeout: () {
          var e2 = sw.elapsedMilliseconds - t0;
          log("Timeout deleting file #$index of $fileCount: ${entity.path} after $e2 ms");
          var zone = Zone.current[#_ez] ??
              (identical(Zone.current, Zone.root) ? "Root" : "Unknown");
          throw TimeoutException(
              "in $zone: Deleting file #$index of $fileCount: ${entity.path}");
        })
          ..ignore(),
      );
    }
  }
  log("All deletes initiated, $completed of $i already complete.");
  await futures.wait;
  log("Done deleting temporary files");
}

from sdk.

cbenhagen commented on August 15, 2024

@lrhn thanks for your analysis! Just wanted to note that replacing the .delete() with .length() shows the same symptoms.

Quoting @mraleph from Discord:

I think dart:io does wrong thing here - it most likely just tries to read this pipe using a blocking IO and exhausts the thread pool that is used for this. Instead it should check if the file you are opening is a pipe and do async IO instead

lrhn commented on August 15, 2024

If length() has the same problem, then it's not due to concurrent updates of the directory structure.
More curioser!
