Need to Tune Elasticsearch Cluster Performance

Closed Posted 1 year ago Paid on delivery

ELASTICSEARCH INTEGRATION

IN SHORT

We expanded our big data pipeline with a hot storage layer built on top of Elasticsearch. Our goal was to query data with very fast response times and make fast analytical decisions, but performance is poor: data indexing is very slow and queries have long response times. We need an ELK expert who can help us fix that.

DATA AND FORMAT

Our data (mainly text) is currently stored in Parquet format (in S3) and in raw formats (TXT, CSV, XLSX, etc.). It is around 10 TB and will grow exponentially.
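
As a rough sizing aid, the 10 TB figure can be turned into a primary shard count. The sketch below is not part of the posting: it assumes a target shard size of 30 GB (a value inside the 10 to 50 GB range commonly recommended for Elasticsearch shards) and assumes the on-disk index size equals the source data size, which real mappings and compression will not match exactly. `estimate_primary_shards` is a hypothetical helper.

```python
import math


def estimate_primary_shards(data_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards needed to keep each shard near the target size."""
    if data_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(data_gb / target_shard_gb)


if __name__ == "__main__":
    # 10 TB taken as 10,000 GB (decimal units, an assumption).
    print(estimate_primary_shards(10_000))
    # The 170 GB batch mentioned in the posting, under the same assumptions.
    print(estimate_primary_shards(170))
```

Because the data is expected to keep growing, any such estimate would need to be revisited, for example by splitting data across several indices rather than one ever-larger index.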

CURRENT ARCHITECTURE

In our current architecture, we have:

• A Spark cluster of 10 nodes (16 CPU, 64 GB RAM, 256GB Disk) to process raw files

• S3 storage for processed data in Parquet format

• A PostgreSQL database to store sessions, history and some metadata.

• A web app built with Play framework (Scala) from which all requests (Spark jobs included) are triggered.

• A (non-optimized) Elasticsearch cluster of 5 nodes (16 CPU, 64 GB RAM, 700 GB disk). Indexing ~170 GB of data (~900 million rows) takes 5 hours.
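
To put the indexing figures above in perspective, the following sketch converts them into throughput numbers and checks the cluster's raw disk capacity against the stated data volume. It uses only the figures from the posting; decimal units (1 GB = 1000 MB) and the helper name `indexing_throughput` are assumptions made for illustration.

```python
def indexing_throughput(rows: int, size_gb: float, hours: float, nodes: int) -> dict:
    """Derive simple throughput figures from a bulk-indexing run."""
    seconds = hours * 3600
    docs_per_sec = rows / seconds
    return {
        "docs_per_sec": docs_per_sec,
        "docs_per_sec_per_node": docs_per_sec / nodes,
        "mb_per_sec": size_gb * 1000 / seconds,  # decimal megabytes
        "avg_doc_bytes": size_gb * 1e9 / rows,
    }


# Figures from the posting: ~900 million rows, ~170 GB, 5 hours, 5 ES nodes.
stats = indexing_throughput(rows=900_000_000, size_gb=170, hours=5, nodes=5)

# Raw disk across the ES cluster versus the total data volume (10 TB as 10,000 GB).
raw_disk_gb = 5 * 700
fits_on_disk = raw_disk_gb >= 10_000

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name}: {value:,.1f}")
    print(f"raw disk: {raw_disk_gb} GB, holds 10 TB: {fits_on_disk}")
```

With these inputs the run works out to about 50,000 documents per second across the cluster (about 10,000 per node), roughly 9.4 MB/s, with an average document of about 189 bytes. The 3,500 GB of raw disk is also smaller than the stated 10 TB, before any replicas are counted, although the indexed size may differ from the source size.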

OUR APPROACH AND PROBLEMS

1. After data transformation, we save the resulting data in S3 (in Parquet format).

2. We then read these Parquet files with Spark into a DataFrame.

3. We then save this DataFrame to an Elasticsearch index (we tried many combinations of sharding and replication settings without any gain in performance).

4. We query/search data from ES and feed Kibana/Grafana, or display it in whatever format the business requires.

While the first two steps are relatively fast (~1 hour for 1 billion rows), the third step takes around 5 hours for a 170 GB file.

Queries also have very poor response times.
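
One commonly suggested adjustment for a large initial load, which the posting does not say has been tried, is to disable index refresh and replicas while loading and restore them afterwards. The sketch below only builds the two JSON bodies that would be sent to the index settings API (`PUT /<index>/_settings`); the function names and the restored values (`1s`, one replica) are assumptions, and sending the requests is left out.

```python
import json


def bulk_load_settings() -> dict:
    """Settings to apply before a large load: no periodic refresh, no replicas."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(refresh_interval: str = "1s", replicas: int = 1) -> dict:
    """Settings to apply after the load finishes (values are assumptions)."""
    return {
        "index": {
            "refresh_interval": refresh_interval,
            "number_of_replicas": replicas,
        }
    }


if __name__ == "__main__":
    print(json.dumps(bulk_load_settings(), indent=2))
    print(json.dumps(restore_settings(), indent=2))
```

Whether this helps here would need to be measured; it addresses indexing time only and does not by itself improve query response times.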

OUR REQUIREMENTS

• Set up a very cost-effective and efficient ELK (Elasticsearch, Logstash, Kibana) cluster (or optimize our existing one)

• Provide (code) an indexer that can migrate existing data from S3 to Elasticsearch

• Fast indexing of documents into Elasticsearch

• Very fast queries and data retrieval. This is very important for our business needs. A response time of 1-3 seconds is acceptable

• Improve communication between the Spark cluster and the ES cluster. Any bottleneck in communication between the two should be detected and fixed
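
As an illustration of what the requested indexer has to produce, the sketch below groups rows into batches and formats each batch as a newline-delimited body for the Elasticsearch `_bulk` API (an action line followed by the document source, with a trailing newline). The index name `events`, the batch size and the sample rows are hypothetical, and the HTTP call to `POST /_bulk` is deliberately not shown. In a Spark pipeline, a connector such as elasticsearch-hadoop would normally do this batching itself.

```python
import json
from typing import Dict, Iterable, Iterator, List


def chunked(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` rows."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[Dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_body(index: str, rows: List[Dict]) -> str:
    """Format rows as an NDJSON body for the _bulk API (index actions only)."""
    if not rows:
        return ""
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(row))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample rows standing in for records read from Parquet.
    sample = [{"id": i, "text": f"row {i}"} for i in range(5)]
    for batch in chunked(sample, size=2):
        print(bulk_body("events", batch), end="")
```

Batch size is one of the parameters worth measuring when looking for the Spark-to-ES bottleneck, since very small batches add request overhead and very large ones increase memory pressure on the ES nodes.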

PROFILE NEEDED

You need to:

• Have strong experience with Elasticsearch (ELK) in a big data processing environment.

• Be comfortable with the Play framework (at least Scala)

• Have good experience working with Spark

IMPORTANT CONSIDERATIONS

• Data is about 10 TB and is quickly growing.

• Spark jobs are triggered from a web app built with the Play framework (Scala)

• The project needs to be done in a reasonably short time (no more than one week).

• You need to connect to our internal network in order to work. You will need very good internet bandwidth and the TeamViewer application installed.

Elasticsearch Scala Spark Big Data Kibana

Project ID: #35094952

About the project

12 proposals Remote project Active 1 year ago

12 freelancers are bidding on average €600 for this job

h87md3h

Hello I'm an elasticsearch developer with 6 years of experience. I have development experience in lucene as well. I'd encourage you to check out my profile reviews section to know my elasticsearch project reviews

€750 EUR in 7 days
(4 Reviews)
5.1
AwaisChaudhry

Hey Good evening , Just finished reading the brief details and currently going through attached files . I see you have been looking for someone who has experience with these tech stacks Spark, Elasticsearch, Scala, Ki More

€750 EUR in 24 days
(1 Review)
3.4
neznamvladan499

✔✔✔✔ Nice to see your posting ✔✔✔✔ Hi, Rafik G.. I read your job posting and feel I can help you successfully complete your project now. I am good at Big Data, Elasticsearch, Scala, Kibana and Spark and I have complete More

€555 EUR in 3 days
(0 Reviews)
0.0
freelancerbonif2

Greetings Dear Client, I welcome you to my profile, where quality and client satisfaction is my top priority with 100% guarantee. I am Expert Boniface, CERTIFIED & VERIFIED freelancer. I'M AN EXPERT IN LISTED PROJECT More

€750 EUR in 3 days
(0 Reviews)
0.0
fayyazs789

Hi there

€420 EUR in 5 days
(0 Reviews)
0.0
AITSoft

Hi, I saw that you need help with Elasticsearch, Kibana, Big Data, Spark and Scala. I have 6 years of experience working on these frameworks. I believe I can help you with it. I would request you to have a look at my p More

€750 EUR in 12 days
(0 Reviews)
0.0
dataspro

Hello: After reading in detail the requirements of your project and concluding that they match my areas of knowledge and skills, I would like to introduce myself. My name is Anthony Muñoz and I am the lead engineer More

€670 EUR in 7 days
(0 Reviews)
0.0
RomanRut

Greeting! I've carefully gone through your job and I am interested in your job because I have a lot of experiences in various data science projects like computer vision. Here is my academic and professional experienc More

€250 EUR in 3 days
(0 Reviews)
0.0
vadbersten

Hi there, thanks for your job posting and hope you are doing well As a senior software engineer, I can help you with this kind of task related to Elastic Stack. Especially, I offer you my 8 years of experience in the D More

€500 EUR in 7 days
(0 Reviews)
0.0
Valuesolutions

Hello, I hope this finds you well. I have just seen your project requiring; Scala Elasticsearch Spark Kibana Big Data I believe that my 10-year experience in this field is what you need right away. Avoid the headache More

€500 EUR in 7 days
(0 Reviews)
0.0