Project:AppAudit

From Wikibase Personal data
Revision as of 03:50, 10 August 2022 by Genferei (talk | contribs) (→‎8.9(Le Temps, Watson Actu))

This is a good place for User:Haixinshi to discuss his progress Podehaye (talk)

8.9(Le Temps, Watson Actu)

Progress:

  1. Discuss the plan for the App Audit project with Chengyang and Andreas.
  2. Write a report explaining the design of the whole App Audit system in detail, and set up plans for this week.

Thanks for the App Audit report. Could you also please include the diagram you showed us in Signal (TheEyeBalls)?

Questions:

What should the priority of the current plans be: building a coarse "Manager App", or exploring Criteo? 1- a coarse "Manager App"

Plans:

  1. Develop a very coarse mobile app, named "Manager App", that manages the data stored by modified apps. It can read the data in public folders and send it to a simple HTTP server.
  2. Work on the Le Monde app to understand how Criteo intervenes. This seems complicated, because the app contains no Criteo SDK.
  3. Work on the Muslim Pro app, which includes the Criteo SDK.
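
The "simple HTTP server" in plan 1 could look roughly like the following. This is only a minimal sketch using the Python standard library: the names `UploadHandler`, `RECEIVED` and `start_server`, the choice of JSON as payload format, and keeping records in memory are all assumptions made for illustration, not the design of the actual Manager App.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store; a real receiver would persist records to disk.
RECEIVED = []


class UploadHandler(BaseHTTPRequestHandler):
    """Accepts JSON records POSTed by the (hypothetical) Manager App."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            record = json.loads(body)
        except ValueError:  # covers invalid JSON and invalid encodings
            self.send_response(400)
            self.end_headers()
            return
        RECEIVED.append(record)
        self.send_response(204)  # accepted, no response body
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_server(host="127.0.0.1", port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), UploadHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The Manager App would then POST one JSON document per collected record to this server. Authentication and transport encryption are left out of the sketch and would be needed before any real user data is sent.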

8.8(Le Temps, Watson Actu)

Progress:

  1. Find the target functions of Smart AdServer.
  2. Talk to MP and confirm that the focus will be on Criteo.
  3. Summarize the App Audit work and prepare it for cooperation with Chengyang.

Questions:

No.

Plans:

  1. Work on the Le Monde app to understand how Criteo intervenes. This seems complicated, because the app contains no Criteo SDK.
  2. Work on the Muslim Pro app, which includes the Criteo SDK.

8.5(Le Temps, Watson Actu)

Progress:

  1. Optimize the dynamic hooking scripts: they can now hook functions in batches and support filtering. I used Python scripts to derive class names (with package names) from the file paths in the decompiled folders, so I can collect class names in batches and hook all classes in specific directories at once. I believe Frida-Server offers the same capabilities as the framework I used at ByteDance; my ex-leader told me they also use Frida now :)
  2. Google Ads still looks difficult. I hooked ALL functions that involve strings related to "bid", "currency" and "rtb", but none of them is called.
  3. Analyze the Tencent SDK and VK SDK in Watson Actu, as proposed by MP. The following functions in the Tencent SDK are called, but no function in the VK SDK is called. I explored the Tencent SDK and found that StatServiceImpl tries to track events.
  • com.tencent.wxop.stat.common.StatLogger@7f01327----#setDebugEnable is called, and the parameters are:

false

  • StatSpecifyReportedInfo [appKey=null, installChannel=null, version=null, sendImmediately=false, isImportant=false]----#toString is called, and the parameters are:

No Paramters!

  • StatSpecifyReportedInfo [appKey=null, installChannel=null, version=null, sendImmediately=false, isImportant=false]----#setAppKey is called, and the parameters are:

A9VH9B8L4GX4
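
The class-name extraction described in item 1 above might be sketched as follows. It assumes an apktool-style layout in which classes live under `smali/`, `smali_classes2/`, and so on; the function names `class_name_from_path` and `collect_class_names` and the `package_prefix` filter are hypothetical, not the actual scripts used here.

```python
import re
from pathlib import Path

# Assumed apktool-style root folders: "smali", "smali_classes2", ...
SMALI_ROOT = re.compile(r"^smali(_classes\d+)?$")


def class_name_from_path(path):
    """Turn '<root>/com/foo/Bar.smali' into 'com.foo.Bar', else None."""
    parts = Path(path).parts
    for i, part in enumerate(parts):
        if SMALI_ROOT.match(part):
            rel = parts[i + 1:]
            break
    else:
        return None  # not inside a smali root folder
    if not rel or not rel[-1].endswith(".smali"):
        return None
    return ".".join(rel[:-1] + (rel[-1][: -len(".smali")],))


def collect_class_names(decompiled_dir, package_prefix=""):
    """Collect all class names under a decompiled folder, optionally filtered."""
    names = set()
    for smali_file in Path(decompiled_dir).rglob("*.smali"):
        name = class_name_from_path(smali_file.relative_to(decompiled_dir))
        if name and name.startswith(package_prefix):
            names.add(name)
    return sorted(names)
```

For example, `collect_class_names(decompiled_dir, "com.tencent.")` would return only the classes of one SDK, which keeps a batch hook small enough to read.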

Questions:

  1. Should I continue working on Google Ads, or should we set it as a long-term goal and first try more practical tasks?

Plans:

  1. Analyze ads in other apps that are not so complicated.
  2. Continue to work on Tencent SDK and VK SDK.

8.4(Le Temps, Watson Actu)

Progress:

  1. Find the way to show the price of ads in Le Temps. Here is a good place to inject prices of ads!
  2. Share and discuss the effects of showing ads in Hestia-Eyeballs Group.

Questions:

No

Plans:

  1. Optimize the scripts that use the Frida-Server API.
  2. Analyze Google Ads in Watson Actu.

8.3(Le Temps)

Progress:

  1. Successfully built the Frida-Server pipeline. Without modifying the source code of the app, I can dynamically output all data that flows through interesting functions. Besides, I can get the calling relationships (stack traces) dynamically.

Questions:

  • Question: MP-->Could you please explain more about what Frida can do?
  • Answer: As explained here, Frida lets us easily inject our own logic into an app's functions, which reveals whether those functions are called (we inject logging code into the target functions, and when they are called, our logs show up); it can also show their parameters (data). Everything I did before (modify the SMALI code of the app -> compile the modified code -> build and sign the app -> run the app -> watch the console for our logs to see what happens in a target function) CAN BE REPLACED by Frida Server, because this tool injects code into an app without modifying it, even while the app is running (I submit the injected scripts and the results are available right away)! To conclude, this tool helps us analyze the SDKs efficiently. In the end, though, we still have to modify smali code to build our product once we have found the target functions and understood the logic.

Plans:

  1. Try to find all class names with code or a tool. The class names can then be fed into my scripts so that I can hook all functions in the app in one shot.
  2. Try to show the price of an advertisement on the UI by finding the connection between the UI and the data-processing functions in LeTemps+AppNexus. This does not look promising so far.
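
Feeding class names into a hooking script, as in plan 1 above, could be done by generating the Frida script from Python. The sketch below is an assumption about how such a generator might look, not the script actually used: `build_hook_script` and the `__CLASS_NAMES__` placeholder are hypothetical, the embedded JavaScript relies on Frida's `Java.perform`, `Java.use` and per-method `overloads`, and it has not been run against a device here. Its log format imitates the hook logs shown elsewhere on this page.

```python
import json

# JavaScript template for a Frida agent; __CLASS_NAMES__ is a placeholder
# (hypothetical) that is replaced with a JSON array of class names.
HOOK_TEMPLATE = """\
Java.perform(function () {
  var classNames = __CLASS_NAMES__;
  classNames.forEach(function (className) {
    var klass;
    try {
      klass = Java.use(className);
    } catch (e) {
      console.log("skip " + className + ": " + e);
      return;
    }
    var seen = {};
    klass.class.getDeclaredMethods().forEach(function (m) {
      var name = m.getName();
      if (seen[name] || !klass[name] || !klass[name].overloads) { return; }
      seen[name] = true;
      klass[name].overloads.forEach(function (overload) {
        overload.implementation = function () {
          console.log(className + "----#" + name + " is called, and the parameters are:");
          for (var i = 0; i < arguments.length; i++) {
            console.log(String(arguments[i]));
          }
          return overload.apply(this, arguments);  // call the original method
        };
      });
    });
  });
});
"""


def build_hook_script(class_names):
    """Return Frida JavaScript that logs calls to every method of the classes."""
    unique = sorted(set(class_names))
    return HOOK_TEMPLATE.replace("__CLASS_NAMES__", json.dumps(unique))
```

The generated text can be saved to a file and loaded with the Frida CLI (for instance with its `-l` option). Hooking every method of many classes at once may slow the app down considerably, so filtering by package first is probably wise.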

8.2(Le Temps)

Progress:

  1. Successfully set up Frida-Server in a ROOT environment. It is very exciting, since:
  • Previously, when I wanted to verify whether an interesting function is called, I had to first modify the corresponding smali code to log something, then compile, then build the app and analyze the logs on the adb console, which cost time;
  • Now, I can write JS code directly and hook target functions DYNAMICALLY, which means that I do not need to modify the source code of apps, and I can easily verify that functions are called and see the data flowing through them.
  2. I can see the possibility of injecting the Frida SO file into the app, so that we would not need to modify smali code. But this is not the top priority, since we already have a workable pipeline.
  3. I spent a lot of time looking for the connection between the UI and the data-processing functions in LeTemps+AppNexus, but the result seems strange to me. From my understanding so far, the data I collected before is NOT used for the UI, especially the "content", which contains a lot of HTML+JS code.

Questions:

  1. It is not very efficient for me to do static analysis (function names are obfuscated and the calling relationships are complex), but dynamic hooking is now very convenient for me; therefore, I can show you more of the data that flows through the app.

MP-->Could you please explain more about what Frida can do?

Plans:

  1. Currently I cannot dynamically hook A BATCH OF functions at a time, but I will try to fix the bugs; this looks feasible.
  2. Try to show the price of an advertisement on the UI by finding the connection between the UI and the data-processing functions in LeTemps+AppNexus. This does not look promising so far.

7.29(Le Temps, Watson Actu)

Progress:

  1. AdViewRequestManager. This class is the pivot of banner advertisements in LeTemps+AppNexus.
  2. Try to show the price of an advertisement on the UI. I can now dynamically analyze the calling relationships for banner advertisements in LeTemps+AppNexus. They are more complex than I expected, since the functions are not called directly one after another but use an event-driven mechanism, meaning that listeners react to events. Besides, the code is obfuscated.
  3. Tried to use Android Studio, but it did not work.

Questions:

No.

Plans:

  1. Try to show the price of an advertisement on the UI.

7.28(Le Temps, Watson Actu)

Progress:

  1. Save data in a shared place outside the SD card and deal with permission problems.
  2. Talk with MP about the pain points of my work; talk with Jacob (although we learned later that we cannot start working on Uber currently).
  3. Root one of the phones and read the Frida documentation.

Questions:

I am worried about legal problems when it comes to decompiling and modifying APKs. What kinds of behavior are acceptable under local law?

Plans:

  1. !First priority!: Build up the Frida environment and learn to write JS code to dynamically hook functions, which makes it easier to find the interesting functions that are actually CALLED.
  2. Write a static analysis of the SDK connections in Le Temps.
  3. Dynamically test the COMSCORE SDK, VKontakte SDK and Tencent SDK in Watson Actu, and analyze the location data of Huawei.

7.27(Le Temps, Watson Actu)

Progress:

  1. MP suggested that I have a look at the VKontakte SDK and Tencent SDK in Watson Actu:
check also for Huawei, in particular for location data
  • smali_classes2/com/vk/api/sdk/okhttp/OkHttpExecutor.smali, where the function is ReadResponse. Unfortunately, this function was not called during my experiments. I also tried other functions related to server responses in the VKontakte SDK, but they were not called either.
  • I found that the COMSCORE SDK was called very frequently in Watson Actu.
  • The bidding-related functions from Google Ads mentioned on 7.26 were not called during my experiments.
  2. I finished saving detailed logs about LeTemps+AppNexus, which contain a lot of interesting information.

Questions:

  1. Finding functions purely by reading logs is very inefficient. I will try to use more advanced tools, such as Android Studio, to dynamically debug the modified APK if applicable, which may help me efficiently identify which functions are actually called. Could I get any technical support for this?
  2. I spend a lot of time testing whether interesting functions are called, focusing on interesting strings like "bidding" and "price". This leads to low productivity and boredom.

Plans:

  1. Find more efficient tools or platforms for dynamic analysis.
  2. Write static analysis about SDK connections in Le Temps.
  3. Dynamically test COMSCORE SDK, VKontakte SDK and Tencent SDK in Watson Actu.

7.26(Le Temps, Watson Actu)

Progress:

  1. Have tried to dynamically modify and test all SDKs listed on 7.23, but they are not called (almost all ads come from AppNexus, which was covered last week). Although I can do static analysis, it is not guaranteed to keep working in the future and cannot be visualized.
  2. Understand the advertisement mechanism in Le Temps.
  3. After reviewing the apps listed by MP, I selected Watson Actu, since it contains much more diverse types of advertisements (including banners, video ads, etc.; see here), and such ads show up frequently in this app.
  4. Decompile Watson Actu and find bidding functions from Google Ads, though they are heavily obfuscated:
  • smali_classes2/com/google/android/gms/internal/ads/zzezz.smali: const-string v10, "bid_response"
  • smali_classes2/com/google/android/gms/internal/ads/zzfac.smali: const-string v11, "bidding_data"
  • smali_classes2/com/google/android/gms/internal/ads/zzdxq.smali: const-string v1, "biddingData"
  • smali_classes2/com/google/android/gms/internal/ads/zzbjl.smali: const-string v3, "gads:scar_trustless_token_for_gbid:enabled"
  • smali_classes2/com/google/android/gms/internal/ads/zzbjl.smali: const-string v2, "gads:inspector:bidding_data_enabled"
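
A keyword search like the one that produced the list above can be sketched in a few lines. This is an illustrative sketch, not the tooling actually used: `find_string_constants` is a hypothetical name, and the regular expression only covers the `const-string` and `const-string/jumbo` instructions.

```python
import re
from pathlib import Path

# Matches e.g.:  const-string v10, "bid_response"
CONST_STRING = re.compile(r'const-string(?:/jumbo)?\s+\w+,\s*"((?:[^"\\]|\\.)*)"')


def find_string_constants(decompiled_dir, keywords):
    """Return (relative path, line number, literal) for matching string constants."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for smali_file in sorted(Path(decompiled_dir).rglob("*.smali")):
        text = smali_file.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            match = CONST_STRING.search(line)
            if match and any(k in match.group(1).lower() for k in lowered):
                rel = smali_file.relative_to(decompiled_dir).as_posix()
                hits.append((rel, lineno, match.group(1)))
    return hits
```

Because the match is a case-insensitive substring test, a keyword such as "bid" also matches unrelated literals like "forbidden", so the hits still need manual review.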

Questions:

  1. Is the static design analysis report desirable? I am afraid that such reports offer little help in practice. My preference is to dynamically analyze and store data. But the biggest obstacle for me is finding the functions or advertising companies that are frequently called or activated while the app runs, so that I can analyze them dynamically. I need suitable apps for this.
  2. I need more concrete requirements, which can come from the experience of trying mobile apps. For example, you use some apps and find data shown on the UI that interests you. Once I know your interests, I can do more, and not only on advertisement bidding.

Plan:

  1. I will try to focus on Google Ads by analyzing Watson Actu, which is widely used.

7.25(Le Temps)

Progress:

  1. √ Make corresponding methods that can parse and save JSON and other information. This point is very important for improving my efficiency. Previously, I had to write all the smali inside a target function, which not only caused register overlap and control-flow errors, but also added complexity, since one line of Java code can compile into multiple lines of smali code. From now on, I only need to write Java code, which is more efficient, and then integrate it into the target smali code. This speeds things up a lot!
  2. × Try to dynamically test SDKs in terms of advertisement. (I was blocked by point 4.)
  3. × Try to find and collect message flow in Le Temps in terms of advertisement network. (I was blocked by point 4.)
  4. √ Make Storing Data Locally work in different environments. Currently, Storing Data Locally works well in my Android Emulator, but it is trickier to adapt to other environments, like the Samsung phone Paul gave me. The reasons are:
  • If there is no SD card, we have to store data in internal memory. However, if the phone is not rooted, it is hard to access data in internal memory (users cannot access this data; only the app itself can).
  • Now, I can get the correct absolute paths for storing data in internal and external storage, which can be adapted to any mobile phone. (But it is not convenient for me to debug on the Samsung phone, since I cannot view text easily.) So I will use the Android Emulator in the next steps for efficiency.

Questions:

  1. Can I root the phone? Rooting is not reversible, but it is very common in dev teams.
  2. The problem of Saving Data Locally is unavoidable in the future. I have a basic idea of how to design it: we can create a floating window for the users (which they can hide, of course). In this window, users can see what types of advertisement they are exposed to, and can choose data to send to our server for further processing. But I am not very sure about this, because it seems like a heavy modification of an app. The Game Testing Team at ByteDance would instead use another app to dynamically inject into the target app.

Plans:

  1. Try to dynamically test SDKs in terms of advertisement.
  2. Try to find and collect message flow in Le Temps in terms of advertisement network.

7.23(Le Temps)

Progress:

"Dynamic" means that when we run our modified app and an advertisement is shown, the modified functions are called and we can store dynamic data (especially advertisement price information).

SDK: AppNexus(Dynamically Tested)

SDK: AppNexus: UTAdResponse. We can dynamically access the following information:

   private static final String RESPONSE_KEY_TAGS = "tags";
   private static final String RESPONSE_KEY_CONTENT = "content";
   private static final String RESPONSE_KEY_WIDTH = "width";
   private static final String RESPONSE_KEY_HEIGHT = "height";
   private static final String RESPONSE_KEY_PLAYER_WIDTH = "player_width";
   private static final String RESPONSE_KEY_PLAYER_HEIGHT = "player_height";
   private static final String RESPONSE_KEY_NO_BID = "nobid";
   private static final String RESPONSE_KEY_CREATIVE_ID = "creative_id";
   private static final String RESPONSE_KEY_ADS = "ads";
   private static final String RESPONSE_KEY_NOTIFY_URL = "notify_url";
   private static final String RESPONSE_KEY_CONTENT_SOURCE = "content_source";
   private static final String RESPONSE_KEY_CLASS = "class";
   private static final String RESPONSE_KEY_PARAM = "param";
   private static final String RESPONSE_KEY_PAYLOAD = "payload";
   private static final String RESPONSE_KEY_ID = "id";
   private static final String RESPONSE_KEY_UUID = "uuid";
   private static final String RESPONSE_KEY_HANDLER_URL = "url";
   private static final String RESPONSE_VALUE_ANDROID = "android";
   private static final String RESPONSE_KEY_TYPE = "type";
   private static final String RESPONSE_KEY_AD_TYPE = "ad_type";
   private static final String RESPONSE_KEY_HANDLER = "handler";
   private static final String RESPONSE_KEY_TRACKERS = "trackers";
   private static final String RESPONSE_KEY_IMPRESSION_URLS = "impression_urls";
   private static final String RESPONSE_KEY_CLICK_URLS = "click_urls";
   private static final String RESPONSE_KEY_ERROR_URLS = "error_urls";
   private static final String RESPONSE_KEY_TIMEOUT = "timeout_ms";
   private static final String RESPONSE_KEY_RESPONSE_URL = "response_url";
   private static final String RESPONSE_KEY_NO_AD_URL = "no_ad_url";
   private static final String RESPONSE_KEY_TAG_ID = "tag_id";
   private static final String RESPONSE_KEY_AUCTION_ID = "auction_id";
   private static final String RESPONSE_KEY_SECOND_PRICE = "second_price";
   private static final String RESPONSE_KEY_BUYER_MEMBER_ID = "buyer_member_id";
   private static final String RESPONSE_KEY_CPM = "cpm";
   private static final String RESPONSE_KEY_CPM_PUBLISHER_CURRENCY = "cpm_publisher_currency";
   private static final String RESPONSE_KEY_CPM_CURRENCY_CODE = "publisher_currency_code";
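
Once a response has been captured as JSON, the price-related fields among the keys above can be pulled out without knowing the exact nesting. The sketch below walks the parsed structure recursively; which keys count as price-related (`PRICE_KEYS`) and the function name `collect_price_fields` are assumptions made for illustration, and the real response layout has not been verified here.

```python
# Keys taken from the constants listed above; the selection is an assumption.
PRICE_KEYS = (
    "cpm",
    "cpm_publisher_currency",
    "publisher_currency_code",
    "second_price",
    "auction_id",
    "buyer_member_id",
    "creative_id",
)


def collect_price_fields(node, keys=PRICE_KEYS):
    """Walk parsed JSON and return one dict per object holding any wanted key."""
    found = []
    if isinstance(node, dict):
        hit = {k: node[k] for k in keys if k in node}
        if hit:
            found.append(hit)
        for value in node.values():
            found.extend(collect_price_fields(value, keys))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_price_fields(item, keys))
    return found
```

Searching by key rather than by fixed path trades precision for robustness: it keeps working if the SDK moves a field to another level, but it cannot tell which ad a stray value belongs to.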

SDK: Magnite : Rubicon Advertising(No Dynamic Test)

It is not clear what information we can collect from this API, but the price of a banner ad can be obtained.

  1. RubiconHelper
  2. RubiconBanner

SDK: AmazonHB(No Dynamic Test)

We can collect the price of an advertisement from Amazon:

  1. import com.amazon.device.ads.DTBAdResponse;

SDK: Smart AdServer(No Dynamic Test)

Bidding Example: we can get bidding information from SASBiddingAdResponse.

  1. import com.smartadserver.android.library.headerbidding.SASBiddingAdResponse
  2. import com.smartadserver.android.library.headerbidding.SASBiddingFormatType
  3. import com.smartadserver.android.library.headerbidding.SASBiddingManager

SDK: Criteo(No Dynamic Test)

SDK Criteo: Bid involves price information. SDK Criteo: CdbResponseSlot carries abundant bidding information:

   @SerializedName("impId") val impressionId: String? = null,
   @SerializedName("placementId") val placementId: String? = null,
   @SerializedName("zoneId") val zoneId: Int? = null,
   @SerializedName("cpm") val cpm: String = "0.0",
   @SerializedName("currency") val currency: String? = null,
   @SerializedName("width") val width: Int = 0,
   @SerializedName("height") val height: Int = 0,
   @SerializedName("displayUrl") val displayUrl: String? = null,
   @SerializedName("native") val nativeAssets: NativeAssets? = null,
   @SerializedName("ttl") var ttlInSeconds: Int = 0,
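
If such a slot is captured as JSON, it could be normalized on the analysis side roughly as follows. This is a sketch under assumptions: the JSON keys are taken from the `@SerializedName` annotations above, the defaults mirror the Kotlin defaults, and `BidSlot` and `parse_slot` are hypothetical names. Note that `cpm` arrives as a string, so it is converted to a decimal number here.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass
class BidSlot:
    impression_id: Optional[str]
    placement_id: Optional[str]
    cpm: Decimal
    currency: Optional[str]
    width: int
    height: int
    ttl_seconds: int


def parse_slot(raw):
    """Build a BidSlot from one slot dict, falling back to the Kotlin defaults."""
    try:
        cpm = Decimal(str(raw.get("cpm", "0.0")))
    except InvalidOperation:
        cpm = Decimal("0")  # unparseable price: treat as no bid value
    return BidSlot(
        impression_id=raw.get("impId"),
        placement_id=raw.get("placementId"),
        cpm=cpm,
        currency=raw.get("currency"),
        width=int(raw.get("width", 0)),
        height=int(raw.get("height", 0)),
        ttl_seconds=int(raw.get("ttl", 0)),
    )
```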

SDK: Google ads(No Dynamic Test):

The following two can be matched according to data type:

  1. SDK: google.android.gms.ads: AdValue public reference code
  2. SDK: google.android.gms.ads: AdValue obfuscated code
  • String getCurrencyCode() //The value's ISO 4217 currency code.
  • int getPrecisionType() //The precision type of the reported ad value.
  • long getValueMicros()//The ad's value in micro-units, where 1,000,000 micro-units equal one unit of the currency.
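
Since `getValueMicros()` reports micro-units, a logged value has to be divided by 1,000,000 before it can be shown as a price. A minimal sketch of that conversion (the helper names `micros_to_units` and `format_ad_value` are hypothetical):

```python
from decimal import Decimal

MICROS_PER_UNIT = 1_000_000  # 1,000,000 micro-units equal one currency unit


def micros_to_units(value_micros):
    """Convert a micro-unit amount to currency units without float rounding."""
    return Decimal(value_micros) / MICROS_PER_UNIT


def format_ad_value(value_micros, currency_code):
    """Render a value together with its ISO 4217 currency code."""
    return f"{micros_to_units(value_micros)} {currency_code}"
```

For example, a logged value of 2500 micros with currency code "CHF" corresponds to 0.0025 CHF.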

Questions:

No.

Plans:

  1. Try to make a class and corresponding methods that can parse and save JSON and other information.
  2. Try to dynamically test SDKs in terms of advertisement.
  3. Try to find and collect message flow in Le Temps in terms of advertisement network.