There is a NaN in the party list, which causes an IndexError when calculating the probability based on party history. While iterating over the list of parties to predict from each party's history in previous polls, you hit a NaN
at some point in the list:
C:\ProgramData\Anaconda3\envs\homework3\python.exe C:/Users/antoi/Documents/Programming/GE2018/main.py
Starting..
...
0%| | 0/270 [00:00<?, ?it/s]
...
72%|███████▏ | 194/270 [00:41<00:16, 4.73it/s]
Traceback (most recent call last):
File "C:/Users/antoi/Documents/Programming/GE2018/main.py", line 47, in <module>
data1 = compare_methods("L2-EX")
File "C:\Users\antoi\Documents\Programming\GE2018\comparison.py", line 118, in compare_methods
party_wise_result, seat_wise_result = final_model(paras[:12])
File "C:\Users\antoi\Documents\Programming\GE2018\model.py", line 55, in final_model
candidate_prob += para6*np.array(predict_partyHistory(current_constituency_data))
File "C:\Users\antoi\Documents\Programming\GE2018\predict.py", line 124, in predict_partyHistory
votes = party_prob[0]
IndexError: list index out of range
Process finished with exit code 1
Indeed, unlike in all the other iterations, at this one the list of parties contains a NaN:
list_parties: ['National Party', 'MMA', 'Allah-o-Akbar Tehreek', 'IND', 'PML-N', nan, 'IND', 'APML', 'PPPP', 'TLP', 'PTI']
And the corresponding party_prob is empty when you look it up for the NaN party:
for party in list_parties:
    party_prob = df_probability[df_probability["Party"] == party]["Probability"].tolist()
    # if party is in gallup survey or it has zero rating
    is_in_history = (df_probability[df_probability["Party"].isin([party])].index).tolist()
    # if party is in gallup (not not empty list is false)
    if( not not is_in_history ):
        votes = party_prob[0]
Indeed, the results are:
party: nan
df_probability[df_probability["Party"] == party]:
Empty DataFrame
Columns: [Party, Probability, Unnamed: 2, Unnamed: 3, Unnamed: 4, Unnamed: 5, Unnamed: 6]
Index: []
party_prob: []
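A possible explanation for why the `isin` check passes while the `==` filter returns nothing: NaN never compares equal to anything, including itself, but pandas' `isin` does match NaN against NaN. This only applies if `df_probability` itself contains rows with a missing Party, which the `Unnamed` columns make plausible but which the output above does not confirm. The sketch below uses a made-up `df_probability`, since the real one is not shown:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_probability, with one row whose Party is missing.
df_probability = pd.DataFrame({
    "Party": ["PTI", "PML-N", np.nan],
    "Probability": [0.4, 0.3, 0.0],
})

party = np.nan

# Elementwise == is always False for NaN, so the filter keeps no rows.
party_prob = df_probability[df_probability["Party"] == party]["Probability"].tolist()

# isin treats NaN as matching NaN, so the index list is not empty.
is_in_history = df_probability[df_probability["Party"].isin([party])].index.tolist()

print(party_prob)     # []
print(is_in_history)  # [2]
```

With these two results the `if` branch is entered and `party_prob[0]` raises the IndexError seen in the traceback.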
My attempt
If I filter the parties to get rid of these NaN values, I get a ValueError instead. I tried:
# find probability of winning for each candidate from gallup survey
candidate_prob = []
list_parties = [x for x in list_parties if str(x) != 'nan']
for party in list_parties:
But got:
C:\ProgramData\Anaconda3\envs\homework3\python.exe C:/Users/antoi/Documents/Programming/GE2018/main.py
Starting..
....
0%| | 1/270 [00:00<00:47, 5.72it/s]C:\Users\antoi\Documents\Programming\GE2018\predict.py:135: RuntimeWarning: divide by zero encountered in double_scalars
prob_extra = 0.5*float(remaining_prob/remaining_candidates)
C:\Users\antoi\Documents\Programming\GE2018\predict.py:174: RuntimeWarning: divide by zero encountered in double_scalars
prob_extra = 0.5*float(remaining_prob/remaining_candidates)
7%|▋ | 19/270 [00:03<00:46, 5.45it/s]C:\Users\antoi\Documents\Programming\GE2018\predict.py:57: RuntimeWarning: divide by zero encountered in double_scalars
prob_extra = 0.5*float(remaining_prob/remaining_candidates)
72%|███████▏ | 194/270 [00:37<00:14, 5.12it/s]
Traceback (most recent call last):
File "C:/Users/antoi/Documents/Programming/GE2018/main.py", line 47, in <module>
data1 = compare_methods("L2-EX")
File "C:\Users\antoi\Documents\Programming\GE2018\comparison.py", line 118, in compare_methods
party_wise_result, seat_wise_result = final_model(paras[:12])
File "C:\Users\antoi\Documents\Programming\GE2018\model.py", line 55, in final_model
candidate_prob += para6*np.array(predict_partyHistory(current_constituency_data))
ValueError: operands could not be broadcast together with shapes (11,) (10,) (11,)
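The shapes in this ValueError suggest that the caller expects one probability per candidate (11), while the filtered list yields only 10. One way around it might be to keep every candidate and give those with a missing party a default value rather than dropping them. The following is a minimal sketch, assuming a default of 0.0 is acceptable; the function name `party_history_probs`, the default, and the sample data are illustrative and not taken from the original code, and the sketch covers only the lookup step, not the rest of `predict_partyHistory`:

```python
import numpy as np
import pandas as pd

def party_history_probs(list_parties, df_probability, default=0.0):
    """Return one probability per candidate, preserving the input length."""
    # Drop rows without a party name, then build a party -> probability lookup.
    known = df_probability.dropna(subset=["Party"]).drop_duplicates("Party")
    lookup = dict(zip(known["Party"], known["Probability"]))
    # Missing (NaN) or unknown parties get the default instead of being removed.
    return [
        default if pd.isna(p) else float(lookup.get(p, default))
        for p in list_parties
    ]

# Hypothetical data for illustration only.
df_probability = pd.DataFrame({
    "Party": ["PTI", "PML-N", np.nan],
    "Probability": [0.4, 0.3, 0.0],
})
list_parties = ["PML-N", np.nan, "IND", "PTI"]

probs = party_history_probs(list_parties, df_probability)
print(probs)  # [0.3, 0.0, 0.0, 0.4]
```

Because the result has the same length as `list_parties`, it can be added to an array that holds one entry per candidate without a broadcasting error.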