
oscar's People

Contributors

pdlan

oscar's Issues

No address associated with hostname

I ran the following command to build the Docker image for the environment:
sudo docker build -t oscar:latest .

It fails with an error:
[Screenshot "Capture"; image not preserved]

What should I do?

Error on poj-clone-classification task

Hi, I am trying to replicate the clone classification task with the poj104 dataset, but I get an error when I run the process shell script inside the process-poj-clone-detection folder.

The script fails at 5_json_to_rawtext.py because the files inst_dict.txt and state_dict.txt, which are expected in ../data-bin/pretrain, have not been created.
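A quick pre-flight check can confirm which of the two dictionaries is actually absent before step 5 runs. This is a minimal sketch: the file names and directory come from the report above, while the helper `missing_dictionaries` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names and directory are taken from the report; the helper is hypothetical.
REQUIRED_FILES = ("inst_dict.txt", "state_dict.txt")


def missing_dictionaries(pretrain_dir, required=REQUIRED_FILES):
    """Return the required dictionary files that are absent from pretrain_dir."""
    base = Path(pretrain_dir)
    return [name for name in required if not (base / name).is_file()]
```

Calling `missing_dictionaries("../data-bin/pretrain")` from the process-poj-clone-detection folder returns an empty list when both files exist. A non-empty result suggests that whatever step writes to data-bin/pretrain has not been run yet, although the report does not say which script that is.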

another error while training

Hi, I'm pretraining with the provided scripts and hit the error below while executing ./model/scripts/pretrain.sh.

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/oscar/model/train.py", line 309, in distributed_main
    main(args, init_distributed=True)
  File "/oscar/model/train.py", line 51, in main
    model = task.build_model(args)
  File "/oscar/model/fairseq/tasks/ir_masked_lm.py", line 210, in build_model
    model = models.build_model(args, self)
  File "/oscar/model/fairseq/models/__init__.py", line 45, in build_model
    return ARCH_MODEL_REGISTRY[args.arch].build_model(args, task)
  File "/oscar/model/fairseq/models/irbert/model.py", line 86, in build_model
    encoder = IRBertEncoder(args, task.instruction_dictionary, task.state_dictionary)
  File "/oscar/model/fairseq/models/irbert/model.py", line 263, in __init__
    copy_weights(self.sentence_encoder, self.sentence_encoder_momentum)
NameError: name 'copy_weights' is not defined

The copy_weights function seems to be missing from this repository.
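One way to confirm that the name really is unbound in model.py, rather than defined under a different spelling or imported conditionally, is to inspect the module's top-level bindings without importing it. The sketch below uses only the standard `ast` module. The helper names are hypothetical, and it looks only at module-level `def`, `class`, `import`, and simple assignments, so names brought in by `from x import *` are not resolved.

```python
import ast


def module_level_names(source):
    """Collect names bound at module level by def, class, import, or assignment."""
    bound = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x".
                bound.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    bound.add(target.id)
    return bound


def is_bound(source, name):
    """Return True if name is bound at module level in the given source text."""
    return name in module_level_names(source)
```

For example, `is_bound(open("/oscar/model/fairseq/models/irbert/model.py").read(), "copy_weights")` returning False would match the NameError in the traceback.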

Docker error.

Hi! I am trying to build the image on a server without internet access, so I modified the Dockerfile and solved some problems, but this one is too hard for me. Could you help me? Thank you.
Current default time zone: 'Etc/UTC'
Local time is now: Sun Nov 20 22:20:26 UTC 2022.
Universal Time is now: Sun Nov 20 22:20:26 UTC 2022.
Run 'dpkg-reconfigure tzdata' if you wish to change it.

Setting up systemd-sysv (245.4-4ubuntu3.18) ...
Setting up libelf1:amd64 (0.176-1.1build1) ...
Setting up libicu66:amd64 (66.1-2ubuntu2.1) ...
Setting up libglib2.0-0:amd64 (2.64.6-1ubuntu20.04.4) ...
Setting up libtinfo6:amd64 (6.2-0ubuntu2) ...
Setting up libproxy1v5:amd64 (0.4.15-10ubuntu1.2) ...
Setting up glib-networking-services (2.64.2-1ubuntu0.1) ...
Setting up distro-info-data (0.43ubuntu1.11) ...
Setting up cmake-data (3.16.3-1ubuntu1) ...
Setting up libstemmer0d:amd64 (0+svn585-2) ...
Setting up librtmp1:amd64 (2.4+20151223.gitfa8646d.1-2build1) ...
Setting up libpackagekit-glib2-18:amd64 (1.1.13-2ubuntu1.1) ...
Setting up libbsd0:amd64 (0.10.0-1) ...
Setting up libkrb5support0:amd64 (1.17-6ubuntu4.1) ...
Setting up ucf (3.0038+nmu1) ...
Setting up libgirepository-1.0-1:amd64 (1.64.1-1~ubuntu20.04.1) ...
Setting up libxml2:amd64 (2.9.10+dfsg-5ubuntu0.20.04.4) ...
Setting up libmagic-mgc (1:5.38-4) ...
Setting up uuid-runtime (2.34-0.1ubuntu9.3) ...
Adding group `uuidd' (GID 105) ...
Done.
Warning: The home dir /run/uuidd you specified can't be accessed: No such file or directory
Adding system user `uuidd' (UID 104) ...
Adding new user `uuidd' (UID 104) with group `uuidd' ...
Not creating home directory `/run/uuidd'.
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Created symlink /etc/systemd/system/sockets.target.wants/uuidd.socket → /lib/systemd/system/uuidd.socket.
Setting up libmagic1:amd64 (1:5.38-4) ...
Setting up librhash0:amd64 (1.3.9-1) ...
Setting up libcbor0.6:amd64 (0.6.0-0ubuntu1) ...
Setting up libyaml-0-2:amd64 (0.2.2-1) ...
Setting up gir1.2-glib-2.0:amd64 (1.64.1-1ubuntu20.04.1) ...
Setting up libglib2.0-data (2.64.6-1~ubuntu20.04.4) ...
Setting up krb5-locales (1.17-6ubuntu4.1) ...
Setting up publicsuffix (20200303.0012-1) ...
Setting up libfido2-1:amd64 (1.3.1-1ubuntu2) ...
Setting up wget (1.20.3-1ubuntu2) ...
Setting up libdconf1:amd64 (0.36.0-1) ...
Setting up libcrypt-dev:amd64 (1:4.4.10-10ubuntu4) ...
Setting up dmsetup (2:1.02.167-1ubuntu1) ...
Setting up shared-mime-info (1.15-1) ...
Setting up gir1.2-packagekitglib-1.0 (1.1.13-2ubuntu1.1) ...
Setting up libc-dev-bin (2.31-0ubuntu9.9) ...
Setting up libxdmcp6:amd64 (1:1.1.3-0ubuntu1) ...
Setting up libkeyutils1:amd64 (1.6-6ubuntu1.1) ...
Setting up libglib2.0-bin (2.64.6-1ubuntu20.04.4) ...
Setting up libc6-dev:amd64 (2.31-0ubuntu9.9) ...
Setting up xdg-user-dirs (0.17-2ubuntu1) ...
Setting up libx11-data (2:1.6.9-2ubuntu1.2) ...
Setting up libxau6:amd64 (1:1.0.9-0ubuntu1) ...
Setting up libmpdec2:amd64 (2.4.2-3) ...
Setting up libpolkit-gobject-1-0:amd64 (0.105-26ubuntu1.3) ...
Setting up libdbus-1-3:amd64 (1.12.16-2ubuntu2.3) ...
Setting up libreadline8:amd64 (8.0-4) ...
Setting up libjsoncpp1:amd64 (1.7.4-3.1ubuntu2) ...
Setting up libedit2:amd64 (3.1-20191231-1) ...
Setting up libk5crypto3:amd64 (1.17-6ubuntu4.1) ...
Setting up less (551-1ubuntu0.1) ...
Setting up libgstreamer1.0-0:amd64 (1.16.3-0ubuntu1.1) ...
Setcap worked! gst-ptp-helper is not suid!
Setting up libarchive13:amd64 (3.4.0-2ubuntu1.2) ...
Setting up libpolkit-agent-1-0:amd64 (0.105-26ubuntu1.3) ...
Setting up libncursesw6:amd64 (6.2-0ubuntu2) ...
Setting up file (1:5.38-4) ...
Setting up libkrb5-3:amd64 (1.17-6ubuntu4.1) ...
Setting up dbus (1.12.16-2ubuntu2.3) ...
Setting up libxcb1:amd64 (1.14-2) ...
Setting up libpython3.8-stdlib:amd64 (3.8.10-0ubuntu1~20.04.5) ...
Setting up libpython3-stdlib:amd64 (3.8.2-0ubuntu2) ...
Setting up libpam-systemd:amd64 (245.4-4ubuntu3.18) ...
Setting up policykit-1 (0.105-26ubuntu1.3) ...
Setting up python3.8 (3.8.10-0ubuntu1~20.04.5) ...
Setting up libx11-6:amd64 (2:1.6.9-2ubuntu1.2) ...
Setting up libxmuu1:amd64 (2:1.1.3-0ubuntu1) ...
Setting up dbus-user-session (1.12.16-2ubuntu2.3) ...
Setting up libgssapi-krb5-2:amd64 (1.17-6ubuntu4.1) ...
Setting up libssh-4:amd64 (0.9.3-2ubuntu2.2) ...
Setting up openssh-client (1:8.2p1-4ubuntu0.5) ...
Setting up libxext6:amd64 (2:1.3.4-0ubuntu1) ...
Setting up python3 (3.8.2-0ubuntu2) ...
Setting up dconf-service (0.36.0-1) ...
Setting up libcurl3-gnutls:amd64 (7.68.0-1ubuntu2.14) ...
Setting up python3-idna (2.8-1) ...
Setting up libcurl4:amd64 (7.68.0-1ubuntu2.14) ...
Setting up python3-six (1.14.0-2) ...
Setting up python3-certifi (2019.11.28-1) ...
Setting up python3-pkg-resources (45.2.0-1) ...
Setting up python3-gi (3.36.0-1) ...
Setting up lsb-release (11.1.0ubuntu2) ...
Setting up xauth (1:1.1-0ubuntu1) ...
Setting up python3-chardet (3.0.4-4build1) ...
Setting up python3-urllib3 (1.25.8-2ubuntu0.1) ...
Setting up cmake (3.16.3-1ubuntu1) ...
Setting up dconf-gsettings-backend:amd64 (0.36.0-1) ...
Setting up git (1:2.25.1-1ubuntu3.6) ...
Setting up python3-distro-info (0.23ubuntu1) ...
Setting up python3-apt (2.0.0ubuntu0.20.04.8) ...
Setting up python3-dbus (1.2.16-1build1) ...
Setting up gsettings-desktop-schemas (3.36.0-1ubuntu1) ...
Setting up glib-networking:amd64 (2.64.2-1ubuntu0.1) ...
Setting up unattended-upgrades (2.3ubuntu0.3) ...

Creating config file /etc/apt/apt.conf.d/20auto-upgrades with new version

Creating config file /etc/apt/apt.conf.d/50unattended-upgrades with new version
Created symlink /etc/systemd/system/multi-user.target.wants/unattended-upgrades.service → /lib/systemd/system/unattended-upgrades.service.
Setting up python3-requests (2.22.0-2ubuntu1) ...
Setting up python3-software-properties (0.99.9.8) ...
Setting up networkd-dispatcher (2.1-2~ubuntu20.04.3) ...
Created symlink /etc/systemd/system/multi-user.target.wants/networkd-dispatcher.service → /lib/systemd/system/networkd-dispatcher.service.
Setting up python3-requests-unixsocket (0.2.0-2) ...
Setting up libsoup2.4-1:amd64 (2.70.0-1) ...
Setting up libappstream4:amd64 (0.12.10-2) ...
Setting up packagekit (1.1.13-2ubuntu1.1) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of force-reload.
Failed to open connection to "system" message bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
Created symlink /etc/systemd/user/sockets.target.wants/pk-debconf-helper.socket → /usr/lib/systemd/user/pk-debconf-helper.socket.
Setting up software-properties-common (0.99.9.8) ...
Setting up packagekit-tools (1.1.13-2ubuntu1.1) ...
Processing triggers for systemd (245.4-4ubuntu3.18) ...
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
Processing triggers for dbus (1.12.16-2ubuntu2.3) ...

The command '/bin/sh -c apt-get update && apt-get install -y git cmake uuid-runtime lsb-release wget software-properties-common && wget --quiet https://golang.org/dl/go1.16.6.linux-amd64.tar.gz -O ~/go.tar.gz && tar xzf ~/go.tar.gz -C /opt/ && ln -s /opt/go/bin/go /usr/local/bin/go && rm ~/go.tar.gz' returned a non-zero code: 4

Here is the new Dockerfile.
The main modification is in the env settings.

ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH /opt/conda/bin:$PATH

ENV DEBIAN_FRONTEND noninteractive
env http_proxy "http://59.69.106.68:808"
env https_proxy "http://59.69.106.68:808"
env ftp_proxy "http://59.69.106.68:808"
ADD sources.list /etc/apt

RUN apt-get update && \
    apt-get install -y git cmake uuid-runtime lsb-release wget software-properties-common && \
    wget --quiet https://golang.org/dl/go1.16.6.linux-amd64.tar.gz -O ~/go.tar.gz && \
    tar xzf ~/go.tar.gz -C /opt/ && \
    ln -s /opt/go/bin/go /usr/local/bin/go && \
    rm ~/go.tar.gz
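In an `&&` chain, the status reported by `/bin/sh -c` is that of the first command that fails. The log shows the apt-get install finishing (it reaches "Processing triggers"), so a plausible reading is that the next step, `wget`, is the one that returned 4. This is an assumption, since `--quiet` suppresses wget's own message. The table below lists the exit statuses documented in the GNU Wget manual; the lookup helper is hypothetical.

```python
# Exit statuses as documented in the GNU Wget manual.
WGET_EXIT_STATUS = {
    0: "No problems occurred",
    1: "Generic error code",
    2: "Parse error (command-line options, .wgetrc or .netrc)",
    3: "File I/O error",
    4: "Network failure",
    5: "SSL verification failure",
    6: "Username/password authentication failure",
    7: "Protocol errors",
    8: "Server issued an error response",
}


def explain_wget_status(code):
    """Map a wget exit status to its documented meaning."""
    return WGET_EXIT_STATUS.get(code, "Not a documented wget exit status")


print(explain_wget_status(4))
```

If wget is indeed the failing step, status 4 would be consistent with the server having no direct internet access. Dropping `--quiet` from the wget line would show the underlying message.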

Error while pretraining

Hi! Thanks for open-sourcing your great work.
I'd like to use OSCAR to embed binaries, but I couldn't find the pre-trained BERT model, so I am pre-training it myself with the provided scripts.
I executed OSCAR/process-pretrain-data/process.sh in the given Docker image and got the error message below.
It seems the irexp_transformer_sentence_encoder file is missing from the fairseq/modules directory.
Could you help resolve this? Thanks!

Traceback (most recent call last):
  File "preprocess.py", line 13, in <module>
    from fairseq import options, tasks, utils
  File "/oscar/model/fairseq/__init__.py", line 9, in <module>
    import fairseq.criterions  # noqa
  File "/oscar/model/fairseq/criterions/__init__.py", line 24, in <module>
    importlib.import_module('fairseq.criterions.' + module)
  File "/opt/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/oscar/model/fairseq/criterions/sentence_ranking.py", line 11, in <module>
    from fairseq import utils
  File "/oscar/model/fairseq/utils.py", line 20, in <module>
    from fairseq.modules import gelu, gelu_accurate
  File "/oscar/model/fairseq/modules/__init__.py", line 28, in <module>
    from .irexp_transformer_sentence_encoder import IRTransformerSentenceEncoderNoMiddleLayers
ModuleNotFoundError: No module named 'fairseq.modules.irexp_transformer_sentence_encoder'
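To see whether other sibling modules are missing as well, the relative imports in `fairseq/modules/__init__.py` can be compared against the files on disk without importing the package. This is a minimal sketch using the standard `ast` and `pathlib` modules. The function name is hypothetical, and it only recognizes `.py` files and sub-packages, not compiled extension modules.

```python
import ast
from pathlib import Path


def missing_relative_imports(init_file):
    """List modules imported as `from .name import ...` with no file or package beside init_file."""
    init_path = Path(init_file)
    package_dir = init_path.parent
    missing = []
    for node in ast.parse(init_path.read_text()).body:
        # level == 1 means a single leading dot, i.e. a sibling of __init__.py.
        if isinstance(node, ast.ImportFrom) and node.level == 1 and node.module:
            top = node.module.split(".")[0]
            as_file = package_dir / f"{top}.py"
            as_package = package_dir / top / "__init__.py"
            if not as_file.is_file() and not as_package.is_file():
                missing.append(node.module)
    return missing
```

Running `missing_relative_imports("/oscar/model/fairseq/modules/__init__.py")` lists every sibling module that the package imports but that is not present in the checkout, instead of surfacing them one traceback at a time.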

Errors in retrain for bindiff

Hi,

Thanks for your excellent work. I ran into several errors when running

./scripts/bindiff.sh

training environment

torch                              1.10.0+cu111
torchaudio                         0.10.0+cu111
torchvision                        0.11.0+cu111
GPU: RTX 2080ti
RAM: 11GB
  1. no argument "use_pooling" in @register_criterion('poj_similarity')
@register_criterion('poj_similarity')
class PojSimilarityLoss(FairseqCriterion):

    def __init__(self, args, task):
        super().__init__(args, task)
        self.inst_padding_idx = task.instruction_dictionary.pad()
        self.state_padding_idx = task.state_dictionary.pad()
        self.task = task
        self.args = args

    def forward(self, model, sample, reduce=True, train=True):
        no_state = self.args.no_state
        no_pce = self.args.no_pce
        pooling = self.args.use_pooling
        output = model(**sample['net_input'], masked_tokens=None, features_only=True, moco_head=False,
            moco_head_only_proj=False, lm_head=False, classification_head_name=None,
            has_state=not no_state, has_pce=not no_pce, pooling_instruction=pooling)

after changing this to

#pooling = self.args.use_pooling

pooling = self.args.no_pooling
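The report does not say what `no_pooling` means. If it is the negation of `use_pooling`, as the name suggests, assigning it directly to `pooling` would invert the intended behaviour. A more defensive lookup, sketched below under that assumption, prefers `use_pooling` when it exists and otherwise derives the value from `no_pooling`. The helper `resolve_pooling` is hypothetical.

```python
from argparse import Namespace


def resolve_pooling(args):
    """Prefer args.use_pooling; otherwise derive it from args.no_pooling."""
    if hasattr(args, "use_pooling"):
        return bool(args.use_pooling)
    # Assumption: no_pooling is a flag meaning "disable pooling".
    return not getattr(args, "no_pooling", False)


# Example with a namespace that only defines no_pooling.
print(resolve_pooling(Namespace(no_pooling=True)))
```

In the criterion this would replace the failing line with `pooling = resolve_pooling(self.args)`.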

got another error:
2. multiple values for keyword "has_pce"

  File "/mnt/g/Projects/OSCAR/model/fairseq/models/irbert/model.py", line 92, in forward
    x, extra = self.decoder(src, features_only, return_all_hiddens, moco_head=moco_head, has_state=has_state,
TypeError: IRBertEncoder object got multiple values for keyword argument 'has_pce'

after removing this keyword, got another error:
3. got multiple values for keyword argument 'pooling_instruction'

  File "/mnt/g/Projects/OSCAR/model/fairseq/models/irbert/model.py", line 92, in forward
    x, extra = self.decoder(src, features_only, return_all_hiddens, moco_head=moco_head, has_state=has_state,
TypeError: IRBertEncoder object got multiple values for keyword argument 'pooling_instruction'
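This kind of TypeError arises in Python when a call passes a keyword explicitly and the same keyword also arrives through `**kwargs`. The sketch below reproduces the pattern with stand-in functions. It is not the repository's code: the truncated call on line 92 is not shown in full, so where the duplicate actually comes from is an assumption.

```python
def decoder(src, has_pce=True, pooling_instruction=False):
    """Stand-in for the real encoder call; only the signature matters here."""
    return has_pce, pooling_instruction


def forward_buggy(src, **kwargs):
    # has_pce is passed explicitly AND may also arrive inside kwargs.
    return decoder(src, has_pce=True, **kwargs)


def forward_fixed(src, has_pce=True, **kwargs):
    # Naming the parameter keeps it out of kwargs, so it is forwarded once.
    return decoder(src, has_pce=has_pce, **kwargs)


try:
    forward_buggy("ir", has_pce=False)
except TypeError as err:
    print("buggy:", err)

print("fixed:", forward_fixed("ir", has_pce=False, pooling_instruction=True))
```

Deleting the keyword at the call site, as tried above, hides the first collision and exposes the next one. Declaring the colliding names as explicit parameters of the forwarding function, or removing the hard-coded duplicates from the inner call, addresses all of them at once.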

after removing this, got another error too:

        add_(Tensor other, *, Number alpha) (Triggered internally at  ../torch/csrc/utils/python_arg_parser.cpp:1050.)
  exp_avg.mul_(beta1).add_(1 - beta1, grad)
Traceback (most recent call last):
  File "train.py", line 356, in <module>
    cli_main()
  File "train.py", line 321, in cli_main
    main(args)
  File "train.py", line 95, in main
    train(args, trainer, task, epoch_itr)
  File "train.py", line 139, in train
    log_output = trainer.train_step(samples)
  File "/OSCAR/model/fairseq/trainer.py", line 346, in train_step
    raise e
  File "/Projects/OSCAR/model/fairseq/trainer.py", line 309, in train_step
    loss, sample_size, logging_output = self.task.train_step(
  File "/OSCAR/model/fairseq/tasks/fairseq_task.py", line 248, in train_step
    optimizer.backward(loss)
  File "/OSCAR/model/fairseq/optim/fp16_optimizer.py", line 103, in backward
    loss.backward()
  File "/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

I don't know how to fix this. Can you tell me what I missed? Thanks in advance for your time.
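The CUDA message itself suggests rerunning with `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous so the Python stack trace points at the operation that actually failed. A minimal way to do that from Python is sketched below. The helper names are hypothetical, and the script path is the one from the report.

```python
import os
import subprocess


def blocking_cuda_env(base_env=None):
    """Copy the environment and enable synchronous CUDA kernel launches."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_with_blocking(command):
    """Run command (a list of arguments) with CUDA_LAUNCH_BLOCKING=1; return its exit code."""
    return subprocess.run(command, env=blocking_cuda_env()).returncode
```

For instance, `rerun_with_blocking(["./scripts/bindiff.sh"])` reruns the training script with the variable set. Training will be slower in this mode, so it is meant for locating the failing operation rather than for regular runs.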

Trouble with make in the Docker image provided

Hi,

I built the dependencies for the docker image that you provided, but when I run make in /oscar/bin, I get:

cd ../irlexer && go build && cp irlexer ../bin
main.go:14:2: cannot find package "github.com/ianlancetaylor/demangle" in any of:
	/usr/lib/go-1.10/src/github.com/ianlancetaylor/demangle (from $GOROOT)
	/root/go/src/github.com/ianlancetaylor/demangle (from $GOPATH)
main.go:15:2: cannot find package "github.com/llir/ll" in any of:
	/usr/lib/go-1.10/src/github.com/llir/ll (from $GOROOT)
	/root/go/src/github.com/llir/ll (from $GOPATH)
Makefile:5: recipe for target 'irlexer' failed
make: *** [irlexer] Error 1

Best,
Jesse
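The search paths in the error (`/usr/lib/go-1.10` and `/root/go/src`) indicate that Go 1.10 is resolving imports from GOPATH, where dependencies have to be fetched explicitly. The sketch below extracts the missing import paths from the build output and prints a `go get` command for each. The helper name is hypothetical, and whether fetching the latest versions of these packages works with Go 1.10 is not something the report confirms.

```python
import re

MISSING_PACKAGE = re.compile(r'cannot find package "([^"]+)"')


def missing_go_packages(build_output):
    """Return import paths named in `cannot find package` errors, in order, without duplicates."""
    seen = []
    for path in MISSING_PACKAGE.findall(build_output):
        if path not in seen:
            seen.append(path)
    return seen


build_output = '''
main.go:14:2: cannot find package "github.com/ianlancetaylor/demangle" in any of:
main.go:15:2: cannot find package "github.com/llir/ll" in any of:
'''

for package in missing_go_packages(build_output):
    print("go get", package)
```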

Could you publish pre-trained models?

Hi. I tried to train the models with the given scripts and dataset, but it is taking more time than I expected.
If you don't mind, could you share your pre-trained models? I probably do not have enough GPU capacity to train in a reasonable time.
