Hi everyone,
I am one of the participants.
In task 1, we combined the transaction and user datasets into one large matrix of shape (number of banks * number of users in 2014, number of features) as our training dataset. The features we extracted are
(1) user gender,
(2) user age category,
(3) user income category,
(4) user wealth category in 16 month,
(5) whether user has a credit card in 16 month,
(6) user location geo information x and y,
(7) user location category,
(8) number of POS visits in the first half of 2014 per user,
(9) number of webshop visits in the first half of 2014 per user,
(10) number of transactions with low, mid, and high amount spent in the first half of 2014 per user,
(11) number of transactions in market categories a to j in the first half of 2014 per user,
(12) number of transactions in the morning, afternoon, and evening in the first half of 2014 per user,
(13) number of credit card uses in the first half of 2014 per user,
(14) number of debit card uses in the first half of 2014 per user,
(15) the distance between user location and branch location,
(16) the distance between user social center location and branch location
(the social center of each user is determined by
sum( number of transactions at "a" POS * ("a" POS geo x, "a" POS geo y) ) ),
(17) number of branch visits decided by neighbors (based on user location geo)
in the first half of 2014 per user,
(18) number of branch visits decided by users near social center
in the first half of 2014 per user,
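
To make features (15) and (16) concrete, here is a minimal sketch of how the social center and the two distances could be computed. It is an illustration, not our exact code. The function names, the (count, x, y) input layout, the division by the total transaction count (so that the result is a location rather than a raw sum), and the use of Euclidean distance are all assumptions that the description above does not fix.

```python
import math

def social_center(pos_visits, normalize=True):
    """Transaction-weighted center of a user's POS locations.

    pos_visits: list of (count, x, y) tuples, one per POS the user
    transacted at (hypothetical layout). With normalize=False this
    returns the raw weighted sum as written in feature (16); with
    normalize=True (an assumption) it divides by the total count.
    """
    sum_x = sum(count * x for count, x, _ in pos_visits)
    sum_y = sum(count * y for count, _, y in pos_visits)
    if not normalize:
        return (sum_x, sum_y)
    total = sum(count for count, _, _ in pos_visits)
    if total == 0:
        return None  # user has no POS transactions
    return (sum_x / total, sum_y / total)

def distance(p, q):
    """Euclidean distance between two (x, y) points (assumed metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Example: 3 transactions at POS (0, 0) and 1 transaction at POS (4, 8).
visits = [(3, 0.0, 0.0), (1, 4.0, 8.0)]
center = social_center(visits)
branch = (4.0, 6.0)                           # hypothetical branch location
dist_to_branch = distance(center, branch)     # feature (16)
dist_from_home = distance((0.0, 0.0), branch) # feature (15), home at (0, 0)
```

Features (17) and (18) could then reuse the same distance function to pick the nearest users by home location or by social center.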