Comments (16)

palewire commented on June 12, 2024

@zstumgoren, yes, the Wisconsin system is based on three factors: the revised flag, the sort order and the company name. It passes my eyeball test.

I'm considering having it throw a hard error rather than a log when no ancestor filing is found. That might prevent some new data entry practice by the state from introducing silent bugs.

The policy is limited to WI thus far. The code in the current PR wouldn't affect anywhere else.
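
A minimal sketch of the hard-error-versus-log choice, assuming rows are dicts kept in the sheet's original sort order. The names `find_ancestor`, `AncestorNotFoundError` and the `company` key are hypothetical and are not taken from the PR.

```python
import logging

logger = logging.getLogger(__name__)


class AncestorNotFoundError(Exception):
    """Raised when a revised notice has no earlier filing to replace."""


def find_ancestor(rows, index, strict=True):
    """Return the index of the nearest earlier row with the same company name.

    `rows` is a list of dicts in the state's original sort order.
    The "company" key is an assumed field name for this sketch.
    """
    target = rows[index]["company"].strip().lower()
    # Walk upward from the revised row, since the original sits above it.
    for i in range(index - 1, -1, -1):
        if rows[i]["company"].strip().lower() == target:
            return i
    message = f"No ancestor filing found for {rows[index]['company']!r}"
    if strict:
        # Fail loudly so a new data entry practice cannot slip through.
        raise AncestorNotFoundError(message)
    logger.warning(message)
    return None
```

With `strict=True` a scrape would stop on the first orphaned revision; with `strict=False` it would only leave a warning in the log.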

from warn-transformer.

palewire commented on June 12, 2024

If the state's raw data has an amendment column, could we use that to strike certain records?

palewire commented on June 12, 2024

Another thought, would a true/false boolean called something like is_amendment be a reasonable way to start flagging these?
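
One way such a flag could be derived, as a rough sketch only. The column name `notice_type` and the keyword pattern are assumptions for illustration; each state's raw data would need its own rule.

```python
import re

# Assumed keywords; real states may label amendments differently.
AMENDMENT_PATTERN = re.compile(r"\b(revis|amend)", re.IGNORECASE)


def flag_amendment(row, column="notice_type"):
    """Return a copy of `row` with a boolean `is_amendment` field added.

    `column` is a hypothetical name for the raw field that carries the
    state's amendment marker.
    """
    value = row.get(column) or ""
    return {**row, "is_amendment": bool(AMENDMENT_PATTERN.search(value))}
```

Rows with a missing or empty column are flagged `False`, so the flag only marks records without removing anything.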

palewire commented on June 12, 2024

Take a look at #68, @chriszs

chriszs commented on June 12, 2024

Yeah, good start! I think what we really want to know is whether something was amended, but that's obviously harder to flag.

chriszs commented on June 12, 2024

In Wisconsin's case, revisions seem to always appear below the original, which could be used to exclude records superseded by amendments.

[Screenshot: Screen Shot 2022-02-21 at 8 17 25 PM]

In this case, we want to keep revision 2, discard revision 1 and the original.
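
As a rough sketch of that rule, assuming each revision sits directly below the filing it replaces and that rows carry hypothetical `company` and `is_revised` fields:

```python
def drop_superseded(rows):
    """Keep only the last filing in each run of an original plus its revisions.

    Assumes `rows` is in sheet order and that a revision immediately follows
    the filing it replaces. Field names are illustrative, not from the scraper.
    """
    kept = []
    for row in rows:
        if row["is_revised"] and kept and kept[-1]["company"] == row["company"]:
            # The revision replaces the filing directly above it.
            kept[-1] = row
        else:
            kept.append(row)
    return kept
```

A revised row with no matching filing above it is kept as-is rather than dropped, so nothing disappears silently.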

palewire commented on June 12, 2024

Hmm. There must be some way to suss this out.

chriszs commented on June 12, 2024

I think it's going to be a state-by-state fight, because the date and amount can change and I suspect in rare cases the location can change, too. It's pretty clear to me what to do in WI, but other states may not be so straightforward.

palewire commented on June 12, 2024

Gotcha. I think a first step is to go state by state and try to mark amendments. I'm going to try that first, and then we can figure out how to handle them next.

palewire commented on June 12, 2024

I went through all of the CSV files. I only found one other state, Iowa, that had a clear and obvious indicator of an amendment. That was added in #69.

Have you seen other states where you know there to be amendments, @chriszs?

I wonder if @zstumgoren has some wisdom for us here, as well.

palewire commented on June 12, 2024

I've drafted an addition to our transformer system for excluding the ancestors of amendments. You can find that here. It encodes my latest opinions on WI and IA. For WI, it strikes the previous record in the sheet if the name is the same. For IA, it does nothing, because I can't spot any duplication introduced by the amendments.
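
The per-state split described here could be sketched as a small dispatch table. This is an illustration under assumptions, not the code in the PR: the function names, the `AMENDMENT_POLICIES` mapping and the `company` and `is_amendment` fields are all hypothetical.

```python
def strike_previous_same_name(rows):
    """WI-style policy: an amendment strikes the row directly above it
    when both rows share a company name."""
    struck = set()
    for i, row in enumerate(rows):
        if i > 0 and row.get("is_amendment") and rows[i - 1]["company"] == row["company"]:
            struck.add(i - 1)
    return [row for i, row in enumerate(rows) if i not in struck]


def keep_all(rows):
    """IA-style policy: amendments are flagged but nothing is removed."""
    return list(rows)


# Hypothetical registry mapping a state's postal code to its policy.
AMENDMENT_POLICIES = {
    "WI": strike_previous_same_name,
    "IA": keep_all,
}


def apply_amendment_policy(state, rows):
    """Apply the state's policy, leaving rows untouched for unlisted states."""
    policy = AMENDMENT_POLICIES.get(state, keep_all)
    return policy(rows)
```

Defaulting to `keep_all` means a state with no entry in the table is never altered, which matches the idea that the policy only affects the states it names.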

chriszs commented on June 12, 2024

Yes, I think. IL comes to mind because I did that recently. It's at the bottom of the spreadsheet there. I'd have to go back and look at other states to see how they handle it. Some just add something like "UPDATE" to the record and then update it, so that may not require special handling.

zstumgoren commented on June 12, 2024

Hey @palewire, I hadn't noticed amendments in other states, but I'd defer to you and @chriszs at this point on where else this may apply. I do like the strategy of giving precedence to the most recently filed amendment, which feels very much in line with how FEC filings work. Alas, it sounds like in WARN-land we need to devise a heuristic for identifying related records in an "amendment chain". @palewire, am I reading your PR correctly that the v1 strategy is based primarily on identification of a repeated company name?

palewire commented on June 12, 2024

@chriszs, when you have a minute can you point out to me what you see in the IL data?

In the meantime, I am going to merge this PR.

chriszs commented on June 12, 2024

So this may be moot if the state's live data portal provides a better alternative to the archive page we're using (I've put this in a ticket), but the archive has these supplemental notices at the bottom of its sheets:

[Screenshot: Screen Shot 2022-02-22 at 3 40 00 PM]

The employee number is additional, so it may not inflate the totals, but it's unclear whether we're handling this well in every case.

palewire commented on June 12, 2024

I think we've got this handled. Correct me if we don't.
