Analyzing Windows RPC Methods & Other Functions Via GraphFrames

  • Author: Roberto Rodriguez (@Cyb3rWard0g)

  • Project: Infosec Jupyter Book

  • Public Organization: Open Threat Research

  • License: Creative Commons Attribution-ShareAlike 4.0 International

  • Reference:

Import Libraries

from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from graphframes import GraphFrame

Initialize Spark Session

spark = SparkSession \
    .builder \
    .appName("WinRPC") \
    .config("spark.sql.caseSensitive","True") \
    .config("spark.driver.memory", "4g") \
    .getOrCreate()
spark

SparkSession - in-memory
SparkContext
Version: v3.0.0
Master: local[*]
AppName: WinRPC

Download and Decompress JSON File

! wget https://github.com/Cyb3rWard0g/WinRpcFunctions/raw/master/win10_1909/AllRpcFuncMaps.zip
--2020-07-21 15:01:41--  https://github.com/Cyb3rWard0g/WinRpcFunctions/raw/master/win10_1909/AllRpcFuncMaps.zip
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/Cyb3rWard0g/WinRpcFunctions/master/win10_1909/AllRpcFuncMaps.zip [following]
--2020-07-21 15:01:41--  https://raw.githubusercontent.com/Cyb3rWard0g/WinRpcFunctions/master/win10_1909/AllRpcFuncMaps.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26891116 (26M) [application/zip]
Saving to: ‘AllRpcFuncMaps.zip’

AllRpcFuncMaps.zip  100%[===================>]  25.64M  4.33MB/s    in 6.1s    

2020-07-21 15:01:47 (4.22 MB/s) - ‘AllRpcFuncMaps.zip’ saved [26891116/26891116]
! unzip AllRpcFuncMaps.zip
Archive:  AllRpcFuncMaps.zip
  inflating: AllRpcFuncMaps.json     
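
If wget or unzip is not available in the notebook environment, the same two steps can be approximated with the Python standard library. This is only a sketch: the helper names `download` and `extract` are made up for illustration, and the URL is the one used in the wget command above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL copied from the wget command above.
ZIP_URL = "https://github.com/Cyb3rWard0g/WinRpcFunctions/raw/master/win10_1909/AllRpcFuncMaps.zip"


def download(url, dest):
    """Download url to dest unless the file already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(zip_path, out_dir="."):
    """Extract every member of the archive into out_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()


# Usage (needs network access), mirroring the two shell commands above:
# extract(download(ZIP_URL, "AllRpcFuncMaps.zip"))
```

Skipping the download when the file already exists makes it cheap to re-run the cell.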

Read JSON File as Spark DataFrame

%%time
df = spark.read.json('AllRpcFuncMaps.json')
CPU times: user 9.34 ms, sys: 5.12 ms, total: 14.5 ms
Wall time: 1min 8s

Create Temporary SQL View

df.createOrReplaceTempView('RPCMaps')

Create GraphFrame

# Vertices: one row per distinct (function name, function type, module)
vertices = spark.sql(
'''
SELECT FunctionName AS id, FunctionType, Module
FROM RPCMaps
GROUP BY FunctionName, FunctionType, Module
'''
)
# Edges: one row per distinct caller -> callee pair
edges = spark.sql(
'''
SELECT CalledBy AS src, FunctionName AS dst
FROM RPCMaps
'''
).dropDuplicates()
g = GraphFrame(vertices, edges)
g
GraphFrame(v:[id: string, FunctionType: string ... 1 more field], e:[src: string, dst: string])
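
To make the two projections concrete, here is a small plain-Python sketch of the same idea on a handful of made-up records. The field names come from the SQL above; every value, including the `IntFunction` type label and the module paths, is hypothetical. The sketch also checks for something worth keeping in mind: because vertices are distinct by (name, type, module) while edges join on the name alone, a function name that exists in more than one module yields several vertex rows with the same id. That may be one reason some rows in the query outputs further down appear more than once.

```python
from collections import Counter

# Hypothetical rows shaped like the RPCMaps view; every value here is made up.
records = [
    {"FunctionName": "RpcEntry", "FunctionType": "RPCFunction",
     "Module": "c:/example/a.dll", "CalledBy": "Dispatcher"},
    {"FunctionName": "Initialize", "FunctionType": "IntFunction",
     "Module": "c:/example/a.dll", "CalledBy": "RpcEntry"},
    {"FunctionName": "Initialize", "FunctionType": "IntFunction",
     "Module": "c:/example/b.dll", "CalledBy": "OtherEntry"},
    {"FunctionName": "LoadLibraryExW", "FunctionType": "ExtFunction",
     "Module": "c:/example/b.dll", "CalledBy": "Initialize"},
    # Exact duplicate of the previous row; it collapses in both projections.
    {"FunctionName": "LoadLibraryExW", "FunctionType": "ExtFunction",
     "Module": "c:/example/b.dll", "CalledBy": "Initialize"},
]

# Same projection as the vertices query: distinct (id, FunctionType, Module).
vertices = {(r["FunctionName"], r["FunctionType"], r["Module"]) for r in records}

# Same projection as the edges query: distinct (src, dst) pairs.
edges = {(r["CalledBy"], r["FunctionName"]) for r in records}

# ids that occur in more than one vertex row (same name in different modules).
id_counts = Counter(v[0] for v in vertices)
repeated_ids = sorted(name for name, n in id_counts.items() if n > 1)

print(len(vertices), len(edges), repeated_ids)  # 4 4 ['Initialize']
```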

Motif Finding

Motif finding refers to searching for structural patterns in a graph.

GraphFrame motif finding uses a simple Domain-Specific Language (DSL) for expressing structural queries. For example, graph.find("(a)-[e]->(b); (b)-[e2]->(a)") will search for pairs of vertices a and b connected by edges in both directions. It will return a DataFrame of all such structures in the graph, with columns for each of the named elements (vertices or edges) in the motif.
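
As a rough illustration of what the example pattern above matches, the sketch below evaluates the same two-way condition over a small, made-up edge list in plain Python. It is not how GraphFrames runs the query (GraphFrames works on DataFrames); the edge list and the helper name `find_bidirectional` are hypothetical.

```python
def find_bidirectional(edges):
    """Return (a, b) pairs where both a->b and b->a exist in the edge list."""
    edge_set = set(edges)
    return sorted((a, b) for (a, b) in edge_set if (b, a) in edge_set)


# Hypothetical directed edges: f and g call each other, g also calls h.
toy_edges = [("f", "g"), ("g", "f"), ("g", "h")]

matches = find_bidirectional(toy_edges)
print(matches)  # [('f', 'g'), ('g', 'f')]
```

Each connected pair shows up twice, once per orientation, because either vertex can play the role of a.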

Basic Motif Queries

What about a chain of 3 vertices where the first one is an RPC function and the last one is an external function named LoadLibraryExW?

# Two-hop motif: a calls b and b calls c (edges are unnamed because only the vertices are needed)
loadLibrary = g.find("(a)-[]->(b); (b)-[]->(c)")\
  .filter("a.FunctionType = 'RPCFunction'")\
  .filter("c.FunctionType = 'ExtFunction'")\
  .filter("c.id = 'LoadLibraryExW'").dropDuplicates()

%%time
loadLibrary.select("a.Module","a.id","b.id","c.id").show(10,truncate=False)
+---------------------------------------+----------------------------------------+----------+--------------+
|Module                                 |id                                      |id        |id            |
+---------------------------------------+----------------------------------------+----------+--------------+
|c:/Windows/System32/appinfo.dll        |RAiLaunchProcessWithIdentity            |Open      |LoadLibraryExW|
|C:/Windows/System32/UserDataService.dll|UdmSvcImpl_GetContactRevisionEnum       |Initialize|LoadLibraryExW|
|c:/Windows/System32/lsm.dll            |RpcWaitAsyncNotification                |Initialize|LoadLibraryExW|
|c:/Windows/System32/lsm.dll            |RpcWaitAsyncNotification                |Initialize|LoadLibraryExW|
|C:/Windows/System32/PhoneService.dll   |PhoneSvcImpl_PhoneRpcGetShouldMuteKeypad|Initialize|LoadLibraryExW|
|C:/Windows/System32/UserDataService.dll|UdmSvcImpl_ToggleContactMaintenance     |Initialize|LoadLibraryExW|
|C:/Windows/System32/UserDataService.dll|UdmSvcImpl_EmptyEmailFolder             |Initialize|LoadLibraryExW|
|C:/Windows/System32/UserDataService.dll|UdmSvcImpl_EmptyEmailFolder             |Initialize|LoadLibraryExW|
|c:/Windows/System32/vpnike.dll         |VpnikeCreateIDPayload                   |Initialize|LoadLibraryExW|
|c:/Windows/System32/vpnike.dll         |VpnikeCreateIDPayload                   |Initialize|LoadLibraryExW|
+---------------------------------------+----------------------------------------+----------+--------------+
only showing top 10 rows

CPU times: user 6.63 ms, sys: 3.24 ms, total: 9.87 ms
Wall time: 37.8 s
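
For readers new to motifs, the chain query above can be pictured as a nested walk over caller-to-callee edges. The following plain-Python sketch does that on a tiny, made-up call graph. The function names other than LoadLibraryExW, the `IntFunction` label, and the helper `two_hop_chains` are all hypothetical, and the code only mimics the logic of the filters, not GraphFrames itself.

```python
# Hypothetical vertex attributes keyed by function name.
function_type = {
    "RpcEntry": "RPCFunction",
    "Helper": "IntFunction",
    "Wrapper": "IntFunction",
    "LoadLibraryExW": "ExtFunction",
    "CreateFileW": "ExtFunction",
}

# Hypothetical directed call edges (caller, callee).
calls = [
    ("RpcEntry", "Helper"),
    ("Helper", "LoadLibraryExW"),
    ("RpcEntry", "Wrapper"),
    ("Wrapper", "CreateFileW"),
    ("Helper", "Wrapper"),
]


def two_hop_chains(calls, function_type, target):
    """Find (a, b, c) with a->b and b->c, where a is an RPC function
    and c is the external function named target."""
    callees = {}
    for src, dst in calls:
        callees.setdefault(src, set()).add(dst)

    chains = set()
    for a, first_hops in callees.items():
        if function_type.get(a) != "RPCFunction":
            continue
        for b in first_hops:
            for c in callees.get(b, ()):
                if c == target and function_type.get(c) == "ExtFunction":
                    chains.add((a, b, c))
    return sorted(chains)


chains = two_hop_chains(calls, function_type, "LoadLibraryExW")
print(chains)  # [('RpcEntry', 'Helper', 'LoadLibraryExW')]
```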

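What if we also restrict the query to a specific module, such as lsasrv.dll? Module paths in this dataset are not consistently cased (both c:/ and C:/ prefixes appear in the output above), so the filter lowercases the path before matching.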

# Same two-hop motif, restricted to RPC functions whose module path ends in lsasrv.dll
loadLibrary = g.find("(a)-[]->(b); (b)-[]->(c)")\
  .filter("a.FunctionType = 'RPCFunction'")\
  .filter("lower(a.Module) LIKE '%lsasrv.dll'")\
  .filter("c.FunctionType = 'ExtFunction'")\
  .filter("c.id = 'LoadLibraryExW'").dropDuplicates()

%%time
loadLibrary.select("a.Module","a.id","b.id","c.id").show(10,truncate=False)
+------------------------------+----------------------------------+-------------------------+--------------+
|Module                        |id                                |id                       |id            |
+------------------------------+----------------------------------+-------------------------+--------------+
|c:/Windows/System32/lsasrv.dll|DsRolerGetPrimaryDomainInformation|LsapDbOpenObject         |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarQueryTrustedDomainInfoByName  |LsapLoadLsaDbExtensionDll|LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarOpenPolicy2                   |LsapDbOpenObject         |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|DsRolerGetPrimaryDomainInformation|LsapDbOpenObject         |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarCreateSecret                  |LsapDbDereferenceObject  |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarEnumerateAccountsWithUserRight|LsapDbDereferenceObject  |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarLookupSids                    |LsapLookupSids           |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarQueryTrustedDomainInfoByName  |LsapDbOpenObject         |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarSetTrustedDomainInfoByName    |LsapDbDereferenceObject  |LoadLibraryExW|
|c:/Windows/System32/lsasrv.dll|LsarOpenAccount                   |LsapLoadLsaDbExtensionDll|LoadLibraryExW|
+------------------------------+----------------------------------+-------------------------+--------------+
only showing top 10 rows

CPU times: user 4.95 ms, sys: 2.65 ms, total: 7.6 ms
Wall time: 23 s

Breadth-first search (BFS)

Breadth-first search (BFS) finds the shortest path(s) from one vertex (or a set of vertices) to another vertex (or a set of vertices). The start and end vertices are specified as Spark DataFrame expressions, and maxPathLength limits the length of the paths that are searched.
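
The idea can be sketched in plain Python on a made-up call graph. The helper `bfs_shortest_paths` and all function names except LoadLibraryExW are hypothetical, and the sketch assumes the start and target sets are disjoint. It illustrates the level-by-level search, not the GraphFrames implementation. Note that the search stops at the first path length that reaches a target, so a start vertex whose only route is longer than the overall shortest one does not appear in the result.

```python
def bfs_shortest_paths(calls, starts, targets, max_path_length):
    """Return every shortest path (list of vertices) from any start to any target.

    Paths with more than max_path_length edges are not explored.
    Assumes starts and targets are disjoint.
    """
    callees = {}
    for src, dst in calls:
        callees.setdefault(src, set()).add(dst)

    frontier = [[s] for s in sorted(starts)]  # partial paths of equal length
    visited = set(starts)
    for _ in range(max_path_length):
        next_frontier, found, newly_seen = [], [], set()
        for path in frontier:
            for nxt in sorted(callees.get(path[-1], ())):
                if nxt in visited:
                    continue  # already reached by a shorter path
                newly_seen.add(nxt)
                if nxt in targets:
                    found.append(path + [nxt])
                else:
                    next_frontier.append(path + [nxt])
        if found:
            return found  # stop at the first length that reaches a target
        visited |= newly_seen
        frontier = next_frontier
    return []


# Hypothetical call edges (caller, callee).
calls = [
    ("RpcA", "Helper"),
    ("Helper", "LoadLibraryExW"),
    ("RpcB", "LoadLibraryExW"),
    ("RpcA", "Wrapper"),
    ("Wrapper", "Helper"),
]

paths = bfs_shortest_paths(calls, {"RpcA", "RpcB"}, {"LoadLibraryExW"}, 3)
print(paths)  # [['RpcB', 'LoadLibraryExW']]
```

Starting from RpcA alone returns the two-edge path through Helper, which is hidden in the combined search because RpcB reaches the target in one edge.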

Shortest Path from an RPC Method to LoadLibraryExW

# Shortest paths from any RPC function to the external function LoadLibraryExW
loadLibraryBFS = g.bfs(
  fromExpr = "FunctionType = 'RPCFunction'",
  toExpr = "id = 'LoadLibraryExW' and FunctionType = 'ExtFunction'",
  maxPathLength = 3).dropDuplicates()

%%time
loadLibraryBFS.select("from.Module", "e0").show(10,truncate=False)
+--------------------------------------+--------------------------------------------+
|Module                                |e0                                          |
+--------------------------------------+--------------------------------------------+
|C:/Windows/System32/appmgmts.dll      |[ARPRemoveApp, LoadLibraryExW]              |
|c:/Windows/System32/nlasvc.dll        |[operator(), LoadLibraryExW]                |
|c:/Windows/System32/lsasrv.dll        |[LsarQueryInformationPolicy, LoadLibraryExW]|
|C:/Windows/System32/tellib.dll        |[operator(), LoadLibraryExW]                |
|C:/Windows/System32/tellib.dll        |[operator(), LoadLibraryExW]                |
|C:/Windows/System32/debugregsvc.dll   |[s_MergeEtlFiles, LoadLibraryExW]           |
|c:/Windows/System32/samsrv.dll        |[SamrCloseHandle, LoadLibraryExW]           |
|C:/Windows/System32/appmgmts.dll      |[GetManagedApps, LoadLibraryExW]            |
|C:/Windows/System32/debugregsvc.dll   |[s_MergeEtlFiles, LoadLibraryExW]           |
|C:/Windows/System32/WaaSMedicAgent.exe|[LoadPluginLibrary, LoadLibraryExW]         |
+--------------------------------------+--------------------------------------------+
only showing top 10 rows

CPU times: user 2.73 ms, sys: 1.58 ms, total: 4.31 ms
Wall time: 13.5 s