Comments (4)
Awesome, thanks so much!
from gill.
Thanks, we've added more details here in the README about the structure. It is the same structure as the original PartiPrompts dataset, but we add a column with human annotations indicating whether the annotators preferred the retrieved or the generated images.
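For anyone who wants to tally the annotations, here is a minimal sketch. It assumes the file is tab-separated like the original PartiPrompts release. The column name `preference`, its values, and the sample rows are all hypothetical and are not taken from the actual file, so check the README for the real names.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real annotation column name and its
# values may differ from what is shown here.
SAMPLE_TSV = (
    "Prompt\tCategory\tChallenge\tpreference\n"
    "a red bicycle\tVehicles\tBasic\tgenerated\n"
    "the Eiffel Tower\tWorld Knowledge\tBasic\tretrieved\n"
    "a corgi wearing a hat\tAnimals\tImagination\tgenerated\n"
)


def count_preferences(tsv_file, column="preference"):
    """Count how often each value appears in the annotation column."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return Counter(row[column] for row in reader)


# Swap io.StringIO(...) for open("your_file.tsv", newline="") on real data.
counts = count_preferences(io.StringIO(SAMPLE_TSV))
print(dict(counts))
```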
Is the raw pairwise preference data public (with both annotations)?
We've uploaded the annotations for the realism/fidelity setting here.
Can we access the exact images that the human annotators saw?
Unfortunately, we will most likely be unable to release the exact images used, because some prompts involve humans and we are also unable to distribute CC3M images. The generated images can be reproduced by running the PartiPrompts through Stable Diffusion v1.5, and the retrieved images can be acquired by using CLIP ViT-L to retrieve the closest image in CC3M (details in Sec 3.3 of the paper).
Hope that helps!
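The retrieval step described above can be sketched as a nearest-neighbour lookup over precomputed embeddings. This is only an illustration. It assumes the prompt and the CC3M images have already been embedded with CLIP ViT-L, and that "closest" means highest cosine similarity, which is an assumption rather than something confirmed in this thread. The function names and the toy three-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_closest(query_embedding, image_embeddings):
    """Return the id of the image whose embedding is most similar to the query.

    image_embeddings maps an image id to its embedding vector.
    """
    return max(
        image_embeddings,
        key=lambda image_id: cosine_similarity(
            query_embedding, image_embeddings[image_id]
        ),
    )


# Toy stand-ins for CLIP embeddings (real ones are much higher-dimensional).
image_embeddings = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.6, 0.8, 0.0],
}
query = [0.1, 0.9, 0.0]

best = retrieve_closest(query, image_embeddings)
print(best)
```

At CC3M scale a pure-Python loop like this would be slow; a batched matrix product over normalized embeddings is the usual approach.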
I see, thanks @kohjingyu, that helps. So that the generated images are as close as possible to the ones you likely showed to the users, could you please provide the exact Stable Diffusion configuration you used, i.e. text guidance, random seed, number of diffusion steps, etc.?
Thanks again! :)
I've uploaded the Stable Diffusion script here. Hope that helps!