Comments (14)
Also, there are some overlapping identities between FaceScrub and MS1M. I downloaded the mapping between MIDs and real names from Freebase. Please check the attachment:
mid_to_name.txt.zip
UPD: Aaron Eckhart has an identifier m.03t4cz. This person is present in both the test and the training sets. Obviously, there are other such persons.
UPD2: m.04wp3s: Sam Rockwell, m.014zfs: Bill Cosby, m.02h3tp: Patrick Swayze, etc. All of these identities are in both the training and test sets. (I've only checked this manually, but I believe the overlap is more than 50%.)
from insightface.
Agreed. But according to my test, the overlap is at least 67.5%. I don't trust any results that are based on celebrity datasets. The most reliable test is the NIST FRVT test, which is free for all researchers.
We are running this experiment and the results will be available in our paper soon. I expect them to be slightly worse (<0.1).
We have already removed 500+ identities from ms1m by checking the similarity between facescrub and ms1m. Please see src/data/dataset_merge.py if you want to know how we remove overlaps.
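As a rough illustration of what similarity-based overlap removal can look like, here is a minimal sketch. It is not the content of src/data/dataset_merge.py: the function names, the toy identity ids, the use of one mean embedding per identity, and the 0.6 cosine threshold are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_overlaps(train_centers, test_centers, threshold=0.6):
    """Return the training identity ids whose center embedding is at least
    `threshold`-similar to the center of any test identity.

    Both arguments map an identity id to one embedding vector (for example,
    the mean of that identity's face embeddings). The threshold is an
    assumed value and would need tuning for a real model.
    """
    overlaps = set()
    for train_id, train_vec in train_centers.items():
        for test_vec in test_centers.values():
            if cosine(train_vec, test_vec) >= threshold:
                overlaps.add(train_id)
                break
    return overlaps


# Toy 2-D embeddings with hypothetical ids, for illustration only.
train_centers = {"m.a": [1.0, 0.0], "m.b": [0.0, 1.0]}
test_centers = {"person_1": [0.9, 0.1]}
to_remove = find_overlaps(train_centers, test_centers)
```

Unlike name matching, a check of this kind does not depend on the two datasets spelling a name the same way, but it can also flag different people who merely look alike, so the threshold matters.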
I just wrote a script that checks for matches between the test persons (the subset of FaceScrub that is used in the MegaFace challenge) and the persons in the training set (your cleaned MS1M list). 54 of the 80 persons are in both the training and test sets:
Stana_Katic m.0fd6sd
Farrah_Fawcett m.01j851
Sam_Rockwell m.04wp3s
Alec_Baldwin m.018ygt
Christopher_Reeve m.0jrny
James_Remar m.05mlqj
Brendan_Fraser m.0227tr
Brianna_Brown m.0gdvdh
Andrea_Bowen m.05dxl5
Tempestt_Bledsoe m.014yqb
Paul_Bettany m.01chc7
Robert_Redford m.0gs1_
Mark_Wahlberg m.0gy6z9
Sarah_Hyland m.0523pz4
Alley_Mills m.0d_3hq
Kit_Harington m.09v4hnq
Victoria_Justice m.07w71b
Robert_Duvall m.015c4g
Edie_Falco m.01dy7j
Peggy_McCay m.05j0x1
Jeremy_Irons m.016ywr
Rebecca_Budig m.03jtgb
Brad_Garrett m.01rcmg
Bill_Cosby m.014zfs
Christel_Khalil m.0719hb
Lindsay_Hartley m.04w9ky
Joanna_Kerns m.0403xb
Emile_Hirsch m.05mkhs
Christine_Lakin m.06wr68
Marilu_Henner m.02pzx7
James_Marsden m.042ly5
Justin_Timberlake m.0j1yf
Adam_Brody m.0214df
Patrick_Swayze m.02h3tp
John_Malkovich m.017r13
Melina_Kanakaredes m.02pbhg
Nadia_Bjorlin m.04vpr3
Ryan_Phillippe m.01ksr1
Fran_Drescher m.01s3kv
Norman_Reedus m.0bs6hr
Robert_Knepper m.07v7p6
Didi_Conn m.04tvm2
Bobbie_Eakes m.03s_t9
Heath_Ledger m.0237fw
Summer_Glau m.039g0_
Emily_Deschanel m.03vd_l
Orlando_Bloom m.09wj5
Daniel_Day-Lewis m.016yvw
Shia_LaBeouf m.04w391
Kimberlin_Brown m.03ff8f
Adrienne_Barbeau m.01z7nj
Dean_Cain m.02qjj7
Erin_Cummings m.063z0nr
Joaquin_Phoenix m.018db8
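A check like the one described above can be sketched as follows. This is not the script that produced the list. It is a minimal illustration that assumes the MID-to-name mapping is already loaded into a dictionary; the function names and the placeholder id `m.hypothetical` are made up for the example.

```python
def normalize(name):
    """Make 'Sam Rockwell' and 'Sam_Rockwell' compare equal."""
    return name.strip().lower().replace(" ", "_")


def find_name_overlaps(test_names, train_mids, mid_to_name):
    """Return sorted (name, mid) pairs for training MIDs whose real name
    also appears in the test identity list."""
    test_set = {normalize(n) for n in test_names}
    matches = []
    for mid in train_mids:
        name = mid_to_name.get(mid)  # None if the MID is not in the mapping
        if name is not None and normalize(name) in test_set:
            matches.append((name, mid))
    return sorted(matches)


# Two entries taken from the list above, plus one hypothetical non-match.
mid_to_name = {
    "m.04wp3s": "Sam Rockwell",
    "m.014zfs": "Bill Cosby",
    "m.hypothetical": "Someone Else",  # placeholder, not a real MID
}
test_names = ["Sam_Rockwell", "Bill_Cosby", "Not_In_Training"]
train_mids = ["m.04wp3s", "m.014zfs", "m.hypothetical"]

matches = find_name_overlaps(test_names, train_mids, mid_to_name)
overlap_ratio = len(matches) / len(test_names)
```

Exact name matching only finds identities whose names agree after normalization, so the count it returns is a lower bound on the true overlap.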
@azat-d I think it is also very difficult to find ALL overlaps by name matching.
@azat-d I have removed 500+ identities from MS1M by comparing it with the FaceScrub dataset, in order to test on MegaFace. For reference, FaceScrub has only 530 identities in total. I believe our result is quite reliable.
The MegaFace test uses only 80 identities from FaceScrub, and I checked YOUR training list against those identities.
And I've found that 54 of those 80 identities are in both the test set and your training set.
I'm talking about this training set: https://pan.baidu.com/s/1eTn6O62
Do you mean that there was additional cleaning of this list?
The 500+ identities were removed from my binary packed dataset, not from this clean list. You can check this in our paper; there is about a 0.3% performance drop (98.3% -> 98.0%).
You need to generate features for all 530 identities if you want to upload the result; the 80 identities are only required by set-1.
Ok, thank you!
Great to hear the results on removing the overlapping identities. Thank you, guys. I will also take a look at this and may post an update here if I get any new results.
Closing, as this has been discussed thoroughly here.