Author: Roberto Rodriguez (@Cyb3rWard0g)
Notes: Download this notebook and use it to connect to your own Elasticsearch database. The BinderHub project might not allow direct connections to external entities on port 9092
Using Elasticsearch DSL¶
pip install elasticsearch
pip install pandas
pip install elasticsearch-dsl
from elasticsearch import Elasticsearch from elasticsearch_dsl import Search import pandas as pd
Initialize an Elasticsearch client¶
Initialize an Elasticsearch client using a specific Elasticsearch URL. Next, you can pass the client to the Search object that we will use to represent the search request in a little bit.
es = Elasticsearch(['http://<elasticsearch-ip>:9200']) searchContext = Search(using=es, index='logs-*', doc_type='doc')
Set the Query Search Context¶
In addition, we will need to use the query class to pass an Elasticsearch query_string . For example, what if I want to query event_id 1 events?.
s = searchContext.query('query_string', query='event_id:1')
Run Query & Explore Response¶
Finally, you can run the query and get the results back as a DataFrame
response = s.execute() if response.success(): df = pd.DataFrame((d.to_dict() for d in s.scan())) df
Using HuntLib (@DavidJBianco)¶
pip install huntlib
from huntlib.elastic import ElasticDF
Create a plaintext connection to the Elastic server, no authentication
e = ElasticDF( url="http://localhost:9200" )
A more complex example, showing how to set the Elastic document type, use Python-style datetime objects to constrain the search to a certain time period, and a user-defined field against which to do the time comparisons. The result size will be limited to no more than 1500 entries.
df = e.search_df( lucene="item:5285 AND color:red", index="myindex-*", doctype="doc", date_field="mydate", start_time=datetime.now() - timedelta(days=8), end_time=datetime.now() - timedelta(days=6), limit=1500 )