Comments (5)
Only the dataloader changes across tasks; the network architecture, loss, and optimizer can be reused without modification. Be careful with the loss masking: the loss should be applied only to the answer part of the sequence.
from llama-adapter.
We follow the original template of ScienceQA.
input1 = create_one_example(self.prompt_format, question, context, choice, answer, lecture, solution, test_example=True)
input2 = create_one_example(self.prompt_format, question, context, choice, answer, lecture, solution, test_example=False)
input1 = torch.tensor(self.tokenizer1.encode(input1, bos=True, eos=False), dtype=torch.int64)
input2 = torch.tensor(self.tokenizer1.encode(input2, bos=True, eos=True), dtype=torch.int64)
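The role of the two encodings can be illustrated with a small sketch in plain Python. The tokenizer here (`toy_encode`) and its token ids are hypothetical stand-ins, not the repo's tokenizer: the point is that the length of the prompt-only encoding (test_example=True) determines how many leading tokens of the full sequence are masked out of the loss.

```python
# Sketch of the loss-masking idea with a toy whitespace "tokenizer".
# toy_encode, BOS, EOS, IGNORE are hypothetical names for illustration only.
BOS, EOS, IGNORE = 1, 2, -1

def toy_encode(text, bos, eos):
    ids = [hash(w) % 1000 + 3 for w in text.split()]  # fake token ids >= 3
    return ([BOS] if bos else []) + ids + ([EOS] if eos else [])

prompt = "Question: 2+2? Answer:"    # test_example=True: prompt without answer
full = "Question: 2+2? Answer: 4"    # test_example=False: prompt with answer

input1 = toy_encode(prompt, bos=True, eos=False)  # prompt only, no EOS
input2 = toy_encode(full, bos=True, eos=True)     # full sequence, with EOS

# Labels start as a copy of the full sequence; the prompt prefix is then
# masked with -1 so the loss is computed only on the answer tokens (and EOS).
labels = list(input2)
labels[:len(input1)] = [IGNORE] * len(input1)

assert all(t == IGNORE for t in labels[:len(input1)])
assert labels[len(input1):] == input2[len(input1):]
```

Because the prompt is a prefix of the full sequence, `len(input1)` is exactly the number of positions to exclude from supervision.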
def __getitem__(self, index):
    # Load the pre-extracted visual features for this example if it has an
    # image; otherwise fall back to an all-zero placeholder of the same shape.
    if self.index[index] in self.visual_features_map:
        visual = self.visual_features[int(self.visual_features_map[self.index[index]])]
        visual = torch.tensor(visual).float()
    else:
        visual = torch.zeros(100, 256).float()
    ann = self.ann[self.index[index]]

    # Build the ScienceQA prompt twice: input1 omits the answer
    # (test_example=True), input2 is the full sequence including the answer.
    question = get_question_text(ann)
    context = get_context_text(ann, self.use_caption)
    choice = get_choice_text(ann, self.options)
    answer = get_answer(ann, self.options)
    lecture = get_lecture_text(ann)
    solution = get_solution_text(ann)
    input1 = create_one_example(self.prompt_format, question, context, choice, answer, lecture, solution, test_example=True)
    input2 = create_one_example(self.prompt_format, question, context, choice, answer, lecture, solution, test_example=False)
    input1 = torch.tensor(self.tokenizer1.encode(input1, bos=True, eos=False), dtype=torch.int64)
    input2 = torch.tensor(self.tokenizer1.encode(input2, bos=True, eos=True), dtype=torch.int64)

    # Pad with -1 (a sentinel for "ignore") or truncate to max_words.
    padding = self.max_words - input2.shape[0]
    if padding > 0:
        input2 = torch.cat((input2, torch.zeros(padding, dtype=torch.int64) - 1))
    elif padding < 0:
        input2 = input2[:self.max_words]

    # Mask the prompt prefix so the loss is applied only to the answer part,
    # then zero out all negative sentinels and keep float masks.
    labels = copy.deepcopy(input2)
    labels[:len(input1)] = -1
    input2_mask = input2.ge(0)
    label_mask = labels.ge(0)
    input2[~input2_mask] = 0
    labels[~label_mask] = 0
    input2_mask = input2_mask.float()
    label_mask = label_mask.float()
    return input2, labels, input2_mask, visual
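Downstream, this masking means the cross-entropy should be averaged only over answer positions. A minimal sketch of that reduction in plain Python (the per-token loss values are made up for illustration):

```python
# Sketch: average per-token losses only where the label mask is 1.
# per_token_loss values are hypothetical, one per sequence position.
per_token_loss = [0.9, 1.2, 0.7, 0.5, 0.25]  # prompt part, then answer part
label_mask = [0.0, 0.0, 0.0, 1.0, 1.0]       # prompt masked out, answer kept

masked = [l * m for l, m in zip(per_token_loss, label_mask)]
loss = sum(masked) / max(sum(label_mask), 1.0)  # guard against empty masks
# loss == (0.5 + 0.25) / 2 == 0.375
```

The same reduction is what the returned float mask enables in a training loop: multiply elementwise, then normalize by the number of unmasked tokens.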
Thank you for your reply! Can the loss function on the ScienceQA dataset also be applied directly to the current V1 version?
Hi @Gary3410, were you successful in replicating the results? I am getting accuracy in the 60s range.