Comments (14)

pkumza commented on August 23, 2024

All right, issue reopened.
╮(╯▽╰)╭

from libradar.

IzzySoft commented on August 23, 2024

One more candidate: AndroidProxySetter uses android-proxy which is not detected.

If you could let me know (or include with the README) how to obtain the values for dn, bh, btc and btn, I (and others) could try to add them to data/tgst5.dat. If additional steps are required for other files (e.g. data/new_dict.dat or permission/tagged_dict.txt), it would be nice to know as well.

pkumza commented on August 23, 2024

Sorry for the late reply, I was very busy these days.
new_dict.dat is the dictionary of APIs.
The calculation is simple and stupid, but easy to use.

for each api:
    b_hash = (b_hash + API_ID * API_NUM) % 999983

999983 is a large prime number.
LibRadar uses (b_hash, b_total_num, b_total_call) as an identifier. I am not sure this can be proven correct mathematically, but it works in most cases.

IzzySoft commented on August 23, 2024

Sorry for the late reply, I was very busy these days.

Yupp, thought so. We're all having more than one task at hand, so no worries if it takes a few days (a short note when estimates say it might take longer is welcome anytime – but I never expect immediate response for things that are not urgent. It's a hobby – and none of my recorded issues was about a "show breaker bug" ;)

Not being deeply involved with the technical details behind the library definitions, I unfortunately didn't understand your pointers here. Is there a short step-by-step instruction? Something along the lines of:

  1. run apktool d foo.apk
  2. cd to the directory the supposed lib is found in (for com/some/lib that's the directory where com resides)
  3. run foo com/some/lib to calculate bar
  4. ...

I take it in your code snippet, b_hash is what becomes bh. But it's unclear to me where API_ID and API_NUM come from, how b_hash must be initialized before the loop – and what happened to dn, btc and btn (or rather how to obtain their values).

Of course, I could always report my findings on "app uses lib " and have you do the work. But sometimes, I'd rather have that finding listed with the app before I forget what I need to rescan :)

LibRadar uses (b_hash, b_total_num, b_total_call) as an identifier.

Ah yes, that's what bh, btn and btc stand for :) Still would need to know how to find those values, given a single APK using such an "unidentified library" :)

pkumza commented on August 23, 2024

All right.
Here are the specific instructions; I want to keep them simple but clear.

  1. run apktool d foo.apk
  2. cd to the directory the supposed lib is found in (for com/some/lib that's the directory where com resides)
  3. As I said before, new_dict.dat is the mapping table for APIs. Read new_dict.dat into a hash map.
  4. Search every corner for the APIs that appear and get a dict with a count for every API, e.g. {0:2, 1:4, 3:1}. This dictionary means API0 appears twice, API1 appears four times, API2 never appears and API3 appears once in com/some/lib. In this case, btn is 3 because three distinct APIs appear, and btc is 7 because 2+4+1=7.
  5. b_hash is initialized to 0. Apply the formula b_hash = (b_hash + API_ID * API_NUM) % 999983 for each API to calculate b_hash.

This is the way to calculate bh, btc and btn. dn stands for repetitions, so it is not calculated at this stage, as we don't yet know whether a sub-package is a library. Once I've got a lot of (bh, btc, btn) tuples, we can cluster them into groups. dn is the size of a group.

If you want to add a new library to the database, that means you've already made sure that com/some/lib is a library. In this case, we don't need to get dn; just put bh, btc, btn, the library name and other information into the database.
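
As a minimal sketch (not LibRadar's actual script), steps 4 and 5 could look like this in Python. The names fingerprint and MODULUS are made up for illustration, and the input is assumed to be the API_ID -> count dict from step 4, containing only APIs that appear at least once.

```python
from typing import Dict, Tuple

MODULUS = 999983  # the prime mentioned in this thread


def fingerprint(api_counts: Dict[int, int]) -> Tuple[int, int, int]:
    """Return (bh, btn, btc) for a mapping of API_ID -> number of occurrences."""
    b_hash = 0  # step 5: b_hash starts at 0
    for api_id, api_num in api_counts.items():
        b_hash = (b_hash + api_id * api_num) % MODULUS
    btn = len(api_counts)           # number of distinct APIs that appear
    btc = sum(api_counts.values())  # total number of API occurrences
    return b_hash, btn, btc


# The example dict from step 4.
print(fingerprint({0: 2, 1: 4, 3: 1}))
```

For the example dict this prints (7, 3, 7), i.e. bh=7, btn=3 and btc=7. Because the sum is taken modulo a constant, the order in which the APIs are visited does not change the result.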

IzzySoft commented on August 23, 2024

Thanks! But sorry if I might sound dumb: I get it as far as step 3 (which is, if I understand correctly, the first part of permission/tag_dict.py). But I'm lost with step 4: I assume "every corner" applies to the unpacked foo.apk. "for every API" applies to what? To the hash map? And while in step 5 I now have the base for b_hash, I'm still confused concerning API_ID * API_NUM.

As I doubt you've been doing those steps manually for the many libraries already listed: Don't you have some script you've used for that, which I'd run at step 3 and it does 3-5 and then spits out the line to be added (to new_dict.dat I assume)? And in the end, don't I have to run permission/tag_dict.py to tag the new entries? And then, how to get them to data/tgst5.dat?

pkumza commented on August 23, 2024

Search the bytecode in the smali files for strings that match an API.
For example, say you find the string "Landroid/widget/OverScroller;->getCurrVelocity(" in a smali file and this string matches
{"key": "Landroid/widget/OverScroller;->getCurrVelocity(", "value": 12}
in new_dict.dat. This means we need to add 1 to the count for API_ID 12.
If this is the first time the string "Landroid/widget/OverScroller;->getCurrVelocity(" is matched, the dict in step 4 should be {12:1}.
If we find the string "Landroid/view/ViewGroup;->onKeyDown(" later, the dict in step 4 should be {12:1, 258:1}.
If we find the string "Landroid/widget/OverScroller;->getCurrVelocity(" again, the dict in step 4 should be {12:2, 258:1}.
After the whole package has been scanned, we have the dict.
For the dict {12:2, 258:1}, b_hash should equal (0+12*2+258*1)%999983.
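
The counting described above can be sketched in Python roughly as follows. This is only an illustration under assumptions: count_apis is a hypothetical helper, api_ids stands in for the key -> value mapping loaded from new_dict.dat (only the two entries mentioned here are included), and the three smali lines are invented for the example.

```python
from collections import Counter
from typing import Dict, Iterable


def count_apis(smali_lines: Iterable[str], api_ids: Dict[str, int]) -> Counter:
    """Count how often each known API string occurs in the given smali lines."""
    counts: Counter = Counter()
    for line in smali_lines:
        # Naive substring scan; fine for a sketch, slow for a large dictionary.
        for key, api_id in api_ids.items():
            hits = line.count(key)
            if hits:
                counts[api_id] += hits
    return counts


# Stand-in for new_dict.dat, using the two IDs mentioned in this thread.
api_ids = {
    "Landroid/widget/OverScroller;->getCurrVelocity(": 12,
    "Landroid/view/ViewGroup;->onKeyDown(": 258,
}

# Invented smali lines for illustration only.
lines = [
    "invoke-virtual {v0}, Landroid/widget/OverScroller;->getCurrVelocity()F",
    "invoke-virtual {v0, v1, v2}, Landroid/view/ViewGroup;->onKeyDown(ILandroid/view/KeyEvent;)Z",
    "invoke-virtual {v3}, Landroid/widget/OverScroller;->getCurrVelocity()F",
]

counts = count_apis(lines, api_ids)
print(dict(counts))
print(sum(api_id * num for api_id, num in counts.items()) % 999983)
```

This prints {12: 2, 258: 1} and then 282, which is the value of (0+12*2+258*1)%999983.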

PS:
There's no doubt that I did those steps with scripts, but that was research code: patches upon patches. The scripts also need access to the database, which makes them very hard to use. As adding a new library does not need steps like clustering, there's no need to use those chaotic scripts.
In fact, I got many API candidates, some of which were not actual Android APIs. So I deleted the wrong ones and calculated the hash numbers again and again.

IzzySoft commented on August 23, 2024

Woah. That would mean scanning everything manually, transferring findings manually by copy-pasting (error prone!), calculating manually (hoping not to have missed an entry)... I'd really like to add my findings – but sorry, that's much too time consuming – especially since results would have to be checked multiple times and would still leave doubt whether one got it right. Some script would be highly appreciated here.

Also a bit unclear is which objects/lines should be counted. E.g. for the proxy example, in Smali I find a lot of strings like

sget-object v0, Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;->NONE:Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;

No opening parenthesis – so not to be matched? But

invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyHost(Ljava/lang/String;)V

would be a valid candidate? So to find all possible candidates for my library "pn": "be/shouldit/proxy/lib", I'd cd into the app's Smali directory (here: tk.elevenk.proxysetter_0.2/smali/tk) and run

grep -hRE "Lbe/shouldit/proxy/lib.+;-.+\(" *

resulting (in my case) in

invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyHost(Ljava/lang/String;)V
invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyPort(Ljava/lang/Integer;)V
invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyExclusionString(Ljava/lang/String;)V
invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxySetting(Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;)V
…

Then I'd need to strip off everything following the opening parenthesis plus everything before the Lbe/ and sort the output, which makes the command

grep -hRE "Lbe/shouldit/proxy/lib.+;-.+\(" * |awk -F "(" '{print $1 "("}' |awk -F "}," '{print $2}' | sort

Resulting lines now look like

Lbe/shouldit/proxy/lib/APL;->disableWifi(
Lbe/shouldit/proxy/lib/APL;->enableWifi(
Lbe/shouldit/proxy/lib/APL;->enableWifi(
Lbe/shouldit/proxy/lib/APL;->getConfiguredNetwork(
Lbe/shouldit/proxy/lib/APL;->getConfiguredNetwork(
Lbe/shouldit/proxy/lib/APL;->getConfiguredNetworks(
Lbe/shouldit/proxy/lib/APL;->getWiFiAPConfiguration(
Lbe/shouldit/proxy/lib/APL;->getWiFiAPConfiguration(
Lbe/shouldit/proxy/lib/APL;->getWifiManager(
Lbe/shouldit/proxy/lib/APL;->getWifiManager(
Lbe/shouldit/proxy/lib/APL;->setup(
Lbe/shouldit/proxy/lib/APL;->writeWifiAPConfig(
Lbe/shouldit/proxy/lib/enums/SecurityType;->equals(
Lbe/shouldit/proxy/lib/enums/SecurityType;->equals(
Lbe/shouldit/proxy/lib/enums/SecurityType;->name(
Lbe/shouldit/proxy/lib/enums/SecurityType;->toString(
Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;->equals(
Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;->equals(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getProxyExclusionList(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getProxyExclusionList(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getProxyHost(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getProxyPort(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getProxySetting(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getSecurityType(
Lbe/shouldit/proxy/lib/WiFiApConfig;->getSSID(
Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyExclusionString(
Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyHost(
Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyPort(
Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxySetting(

which fits your description and already makes counting easier. I've already checked with new_dict.dat that this library isn't yet there. So again I'm stuck as to where to get the value from: take the last line of new_dict.dat and increase its value by 1? So after the first 3 lines above, my dict would look like {99850:1, 99851:2} – and what then? What needs to be added to new_dict.dat (most likely {"key": "Lbe/shouldit/proxy/lib/APL;->disableWifi(", "value": 99850} etc), where does the value behind the colon go to, and how does the entire match come into tgst5.dat?

Maybe it's easier if I instead submit the results of the last mentioned command (with some describing details), and you continue from there (as you've got routine to do that)? Once it is in tgst5.dat, I could proceed adding the missing details (as I did with all the other libs).
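
For reference, the grep/awk pipeline above could be approximated in Python along these lines. This is a rough sketch, not an exact port: extract_calls and smali_lines are hypothetical helper names, the regular expression is somewhat stricter than the grep pattern (it requires a "/" after the package prefix and pulls out the "Lpkg/Class;->method(" part directly), and Python's default sort order can differ from that of the sort command.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, List


def extract_calls(lines: Iterable[str], package: str) -> List[str]:
    """Return sorted 'Lpackage/...;->method(' strings found in the lines."""
    pattern = re.compile(r"L" + re.escape(package) + r"/[^;\s]*;->[^(\s]+\(")
    found: List[str] = []
    for line in lines:
        found.extend(pattern.findall(line))
    return sorted(found)


def smali_lines(root: str) -> Iterator[str]:
    """Yield every line of every .smali file below root."""
    for path in Path(root).rglob("*.smali"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            yield from handle


# Sample lines taken from the output quoted above.
sample = [
    "sget-object v0, Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;->NONE:Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;",
    "invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyHost(Ljava/lang/String;)V",
    "invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxyPort(Ljava/lang/Integer;)V",
    "invoke-virtual {v0, v1}, Lbe/shouldit/proxy/lib/WiFiApConfig;->setProxySetting(Lbe/shouldit/proxy/lib/reflection/android/ProxySetting;)V",
]
for call in extract_calls(sample, "be/shouldit/proxy/lib"):
    print(call)
```

The sget-object line is skipped because it contains no opening parenthesis, and the three invoke-virtual lines yield one entry each. To scan a real tree, extract_calls(smali_lines("smali/tk"), "be/shouldit/proxy/lib") would be the equivalent call, with the directory adjusted to the unpacked app.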

pkumza commented on August 23, 2024

Yes.
I did this in a Python script and used regexes too, which is somewhat automatic.
However adding a new library is still difficult.
I would appreciate it if you could give me a list of libraries that do not appear in my database.

"Lbe/shouldit/proxy/lib/APL;->disableWifi(" is a method that the code uses, but that does not mean this string is an Android system API. In my opinion, it comes from the code from another package and should not be recognized as an API. Such names are easily obfuscated, so I didn't put methods like this into new_dict.dat. Most Android system APIs begin with "Landroid".

By the way, I will update the whole project and prepare to add automatic updating of the database later this year (for my graduation project). More detailed information and functionality will be added.

IzzySoft commented on August 23, 2024

However adding a new library is still difficult.

I definitely agree :)

I would appreciate it if you could give me a list of libraries that do not appear in my database.

Whenever I find any. Until now, that's the two mentioned above:

  • AndroidProxySetter uses android-proxy which is not detected. Relevant strings in my previous post. pn for this library would be be/shouldit/proxy/lib.

  • qutelauncher uses Firebase Analytics (pn: com/google/firebase). Corresponding findings from Smali:

    Lcom/google/firebase/analytics/FirebaseAnalytics;->getInstance(
    Lcom/google/firebase/messaging/FirebaseMessagingService;-><init>(
    Lcom/google/firebase/messaging/RemoteMessage;->getData(
    Lcom/google/firebase/messaging/RemoteMessage;->getData(
    Lcom/google/firebase/messaging/RemoteMessage;->getFrom(
    Lcom/google/firebase/messaging/RemoteMessage;->getNotification(
    Lcom/google/firebase/messaging/RemoteMessage;->getNotification(
    Lcom/google/firebase/messaging/RemoteMessage$Notification;->getBody(
    
  • qutelauncher also uses GMS (pn: com/google/android/gms), which is not reported (seems to be used by Firebase (my guess would be those "RemoteMessage" calls), not by the app directly – and the resulting list would be pretty long; too long to be included here). Strange that it's not reported, as it is already known to LibRadar. If you want to check, grab the *full.apk from behind the link.

In my opinion, it comes from the code from another package

I doubt that, but I might be wrong: I've limited my grep to just the application package directory itself (i.e. I did a cd tk.elevenk.proxysetter_0.2/smali/tk first), to avoid having the library's own "inner calls" recorded along.

By the way, I will update the whole project and prepare to add automatic updating of the database later this year (for my graduation project). More detailed information and functionality will be added.

That sounds great! Fingers already crossed for your graduation!

pkumza commented on August 23, 2024

I will try to add this functionality before 4th March. After implementing it, I'll send a notification to your Twitter. ^_^

IzzySoft commented on August 23, 2024

Uh. Had you not closed this already, I'd have said you could simply close it when done, so I get a notification from GitHub…

pkumza commented on August 23, 2024

I've added the function.
This week I will write some test cases to make sure it works and add documentation on how to add a new lib.

IzzySoft commented on August 23, 2024

Thanks! Looks like soon it's time I try the new version then. OTOH, seeing the install instructions, I'm afraid it won't be that soon; it got too many dependencies. I prefer if things either come straight from the repositories, or run "out of their directory". Having to install self-compiled stuff via "make install" (Redis 3.2, as the repos only hold Redis 3.0) plus things via pip/pypi (which isn't installed itself even on my machine) is not my first choice ;)

Does it already return JSON the way it did before? Or did the format change?
