Under the Mighty Hood of TheHive 4

We have been speaking about it for almost two years. We have been making it for more than twelve months. And the day (or rather the month in this case) has almost come for TheHive 4, our latest and greatest version, to be unleashed.

While the first release candidate should be published by the end of this month, we would like to cover some of the most important changes we introduced in a platform which we rewrote almost from the ground up (40,000 lines of Scala code and counting), while keeping the familiar look&feel our longtime users came to expect.In a previous blog post, we covered TheHiveFS, a nifty feature of TheHive4 that allows you to quickly access all files stored in TheHive directly from your investigation machine. It’s time now to get a look under the hood of THeHive 4.

My Time is Precious. TL;DR Please

A picture is worth a thousand words, right? Here you go then!

The Hive 4’s Brand New Architecture

I am Puzzled, can you Elaborate a Bit?

So, you are not in a hurry anymore? Fine. Here, grab a seat, a glass of Gevrey-Chambertin and tasty Burgundy snails. All set? Let’s start then!

TheHive 4 will be the first version to use a graph database instead of Elasticsearch. Yes, you read that correctly. TheHive 4 won’t support Elasticsearch anymore but fear not fearless cyberdefender. Your friendly bees will not leave you hanging. If you are already using TheHive 3.4.x, we will provide a migration tool that will move your existing data to the new storage system (with no losses or bit flips hopefully).

We haven’t decided to ditch Elasticsearch on a whim or because Thomas (Franco, not Chopitea nor the General) dropped his leftist hipster attitude for a tight, tailor-made dictator uniform straight out of Spain. For all its greatness, ES has some annoying limitations which prevented us from adding, in an elegant, haiku-like way important features such as multi-tenancy, RBAC and large file management, while laying the ground for the future (stop being curious, the future has not been invented yet and when we do invent it, we’ll let you know).

Using JanusGraph, TheHive 4 structures information in graphs and stores them in an Apache Cassandra database. All the files that you attach to task logs or add as observables are stored in a Hadoop Distributed File System (HDFS).

Thanks to this brand new architecture, TheHive 4 is horizontally scalable. You can add as many TheHive, Cassandra and HDFS nodes to your Security Incident Response Platform cluster and sustain whatever load you might be facing without a sweat. Who said FOSS can’t be ‘enterprise grade’ (whatever that means in marketing lingo)?

Tour d’Horizon of the Main Features

TheHive 4, boosted by all the passion and skills of Zen Master Franco and MC Adouani, will support, in addition to TheHiveFS:

  • Multi-tenancy
  • RBAC
  • 2FA
  • Web configuration
  • API versioning

We will cover some of these features in greater detail in future instalments. In the meantime, let’s take a ride in a helicopter and view the wonderful landscape laying before us from above. After you Messieurs-Dames, we are French gentlemen and gallantry is of the essence (except when we use the public transportation in Paris, then savages we become).

Multi-Tenancy

As in Cortex, you will be able to create multiple organisations within a single instance of TheHive 4. In addition, an organisation can decide to share a case or parts of it (say a task, some observables, etc.) with other organisations. That way, a peer organisation or a constituent can contribute to the investigation at hand, provide essential information, etc.

RBAC

TheHive 4 supports a large set of user permissions. Some pertain to administrators, others to users and there are also permissions that apply to connectors. For example, users can manage tasks but not observables. They can have the power to share a case or part of it with sister organisations and execute Cortex analyzers but not responders.

You will be able to create roles for users, and, at the organisational level, what we call shares. RBAC deserves its own blog post and we’ll get to it pretty soon.

2FA

Do you really want us to describe this one? Before you answer yes, we’d like to remind you that you are in a helicopter. Just sayin’.

‘They asked me to explain 2FA. So I helped them out of the helicopter. It was flying way above ground.’
Source: Berserk, FNAC.com

Web Configuration

Tired of using vi, Emacs or your favourite CLI editor for making configuration changes to TheHive’s application.conf? Tired of restarting the service to take into account those modifications? Then you will certainly go dance kizomba with Nabil all night long when we tell you that you don’t need to use vi & service (or whatever the kids are using these days) anymore!

Thanks to the new architecture, all the configuration will be stored in the underlying database and you will be able to edit it using the WebUI. TheHive will automatically take the changes into account and you won’t need to restart it.

We can feel your love here. Merci !

API Versioning

TheHive 4 adds API versioning and it will maintain backward compatibility with TheHive 3.4.x without preventing us from adding new features. TheHive4py will not be updated right away for TheHive 4 but thanks to the backward API compatibility, all existing feeders and programs that use the current version of TheHive4py will still work out of the box.

That’s all folks! Stay tuned for further news and, in the meantime, don’t be blue cuz’ the bees gonna take care of you.

TheHiveFS

TheHive Project’s Code Chefs, sweating under their toques, are working hard to deliver TheHive 4 as soon as feasible. The current target release date for the 1st release candidate (4.0-RC1) is Friday Feb 28, 2020.

While TheHive 4 will be the first release to support graph databases, multi-tenancy and Role-Based Access Control (RBAC), it will also have a nifty feature that can simplify the incident response and digital forensics workflows of our fellow cyberdefenders: TheHiveFS.

What is TheHiveFS?

Starting from TheHive 4, TheHive can be ‘mounted’ as a remote, WebDAV filesystem. The filesystem can be securely mounted if SSL/TLS is enabled.

Thanks to TheHiveFS, you can quickly access all files stored in TheHive directly from your investigation machine. This can speed up the time needed to triage and analyse evidence. 

What Types of Files Can I Access through TheHiveFS?

You can access, in read-only mode, all files attached to task logs and all observables which datatype is file, as long as you are allowed to do so. Indeed, TheHive 4 comes with RBAC so if, for example, you are not allowed to view a case or some file observables in a case, you won’t be able to access them using TheHiveFS, the same way as if you are using the WebUI.

Screenshot showing an analyst accessing file observables and files associated to tasks of case #40 using TheHiveFS

How Can I Mount TheHiveFS?

Assuming you have a WebDAV client, such as davfs2, use the following command line:

$ sudo mount -t davfs -o noexec https://myhiveinstance:9001/fs /mnt/dav/

You can also point your graphical file manager to:

dav(s)://myhiveinstance:9001/fs

You will need to authenticate using your username and password as if you were connecting to TheHive’s WebUI.

Mom, I’ve Just Stepped on a Landmine

Beware folks. When you download a file observable using TheHive’s WebUI, it will conveniently create a password-protected ZIP archive before handing you the file. This way, we avoid accidental double clicks that may lead to the infection and compromise of your workstation, which might reflect bad on you or force you to offer breakfast the next morning to all your fellow teammates.

There is no such protection if you use TheHiveFS. Let us repeat this so it sinks: there is no such protection if you use TheHiveFS.

If you mount TheHive’s filesystem and open by accident or by a great deal of will, as a true, hardcore fan of Russian roulette, a file observable that is in fact malware courtesy of your favourite bear, kitten, panda or eagle, you can’t blame your friendly bees. But we will empathise (and our empathy level is directly correlated to the amount of pains au chocolat you send our way).

You’ve been warned.

That Sounds Awesome! When Can I Try It?

As written above, you will be able to try TheHiveFS as soon as TheHive 4.0-RC1 is released and that’s currently planned for the end of February 2020.

You can cry, beg, try to bribe us with VC money, make the line at 3:00 AM in front of TheHive Store (there ain’t no such store, we are not Apple), this will not make us work any faster. But you can always cheer us up, hug us or just thank us. This means a lot to us and to the free, open source software flame we carry deep within our souls.

One More Thing…

While we aren’t Apple, we can mimic Steve to share one more information that will make TheHiveFS even more interesting by Q3-Q4 2020. We plan to add support for large file management in TheHive 4.1, the next major version after 4.0 as would Captain Obvious say. Thanks to this feature, you will be able to upload memory and disk images to TheHive and if your Internet line breaks, the upload will resume automatically. 

That’s all folks!

TheHive 3.4.0 & Cortex 3.0.0 Released

For many months, we have been concentrating our efforts on TheHive 4, the next major version of your favourite Security Incident Response Platform, which we’ll finally provide RBAC (or multi-tenancy if you prefer), a feature that Cortex had for quite some time now.

Source : dilbert.com © Scott Adams

As you well know, both TheHive and Cortex rely on Elasticsearch (ES) for storage. The choice of ES made sense in the beginning of the project but as we added additional features and had new ideas to give you the best experience possible, we faced several ES quirks and shortcomings that proved challenging if not outright blocking for making our roadmap a reality, including RBAC implementation in TheHive, a far more complex endeavour than RBAC in Cortex. Transitioning from ES to graph databases was necessary and since we want our existing users to have a smooth migration path, TheHive 4 (the first release candidate should come out of the oven by the end of the year) will support both ES and graph databases.

But while we were focusing on that, we completely lost sight of the end of life of ES 5.6 so we wrote an apology to you, our dear users, back in May.

Shortly after, we released TheHive 3.4.0-RC1, to add support for ES 6 (with all the breaking changes it has introduced). We also did the same for Cortex with the release of Cortex 3.0.0-RC3. We also took that opportunity to clear out some AngularJS technodebt we had.

We then asked you to take them for a spin and report back any bugs you find given that both versions had to support ES 5.6 and ES 6 to allow for proper migration.

After a few rounds of release candidates, we are pleased to announce the immediate availability of TheHive 3.4.0 and Cortex 3.0.0 as stable releases.

Before upgrading your existing software to these new versions, please make sure to read the blog post we wrote back in June. We invite you to pay great attention to the regressions that we were forced to introduce because of ES 6.

You should also note that, in addition to ES 6 support, Cortex 3.0.0 supports fully dockerised analyzers and responders. We’ll elaborate on this in a future blog post soon.

Changelogs

If you are interested in some nitty-gritty details, we invite you to read the relevant changelogs since our last post on the subject:

Running Into Trouble?

Shall you encounter any difficulty, please join our  user forum, contact us on Gitter, or send us an email at support@thehive-project.org. We will be more than happy to help as usual!

An Apology

Dear Users,

We owe you an apology. We thought we would never need to support Elasticsearch 7 or even 6. We thought we could stick with the latest version of Elasticsearch 5 as the underlying storage and indexing engine for TheHive and Cortex until we would be able to complete the transition to a graph database. Moving to such a database is a necessity for your favourite open source, free Security Incident Response Platform and its analysis and orchestration companion, a necessity that has grown out of our frustration with Elasticsearch and its limitations, with the breaking changes that ES 6 introduced which forbid a smooth transition and puts a significant toll on an open source initiative such as ours.

We initially thought we could complete the transition by October of last year and finally offer you long-desired features such as RBAC and multi-tenancy as well as establish a solid ground to implement some exciting ideas that would help you lower the barrier to entry for junior analysts, save more time and concentrate on your work instead of having to master copy/paste between various interfaces or moving from one tool to the other.

Sadly, things did not play out the way we wanted. As TheHive and Cortex were adopted by more and more organisations, feature requests kept piling up and being generous bees, we have always strived to keep our users happy within the confines of our limited resources. Certainly, our user community helped us significantly by contributing a huge number of analyzers to Cortex in no time, making the total amount fly past the 100 landmark. However, we had to rely mostly on ourselves for heavy-duty backend work while steadily releasing new versions to satisfy the appetite for capabilities that sounded reasonable and feasible within a realistic, acceptable timeframe. Multi-tenancy and RBAC also proved more complex than initially foreseen and since we hate a half-baked recipe (blame it on our French culture and our love for delicious food), we did not want to rush things out and add flimsy ‘patch’ code.

Source : https://kininaru-korean.net/archives/10305

So we focused on supporting graph databases and working on multi-tenancy and RBAC. You certainly noticed our silence these past weeks. And we completely lost sight of the end of life of ES 5.6 until we realised recently that it was no longer supported by Elastic, not even in critical bug fix mode. When ES 7 was released on April 10, the death sentence of ES 5.6 was pronounced and its coffin permanently nailed.

We know this is a lot to stomach. Welcome to the Upside Down! But remember: keep calm. Help is already on the way and hopefully this time around the cops will arrive before the movie is over. We are shifting our priorities to release new major versions of TheHive and Cortex in order to use a supported version of ES. This work should take a few weeks at least. In the meantime, if you are using TheHive and Cortex with their own, standalone ES instance and you have implemented sane network security measures to shield ES against unwanted remote access, you should be fine.

We also took the opportunity to look at what other external code we rely on and that would need to be updated as well, to avoid falling in the EOL trap again. Glad we looked! The current versions of TheHive and Cortex both use AngularJS 1.5 (here, take a stone and throw it the Hulk’s way on Nabil’s forehead). We are going to update our frontends to use AngularJS 1.7.

We will come up imminently with a concrete action plan to address our embarrassing miscalculation. Meanwhile, please accept our sincere apologies and rest assured that we won’t let you down.

ごめんなさい 🙏🏼

The Mind-Boggling Implications of Multi-Tenancy

TheHive offers a powerful yet generic query API for all the data stored by the platform in the underlying Elasticsearch database.

Thanks to its DSL (Domain Specific Language), TheHive can handle complex search queries such as the following:

Among all the unassigned tasks, show me all those associated with cases which severity is high but also contain the highest number of observables which datatype is  ‘mail’

When faced with such complex queries, TheHive translates them using its DSL and sends them over to Elasticsearch to obtain the results. TheHive’s dashboards draw their power from such querties.

And while such capability is highly desirable in our opinion, a capability that we will further leverage to add a completely revamped search module in the upcoming Cerana 1 (TheHive 3.1) release, it greatly complicates RBAC (or multi-tenancy) in TheHive.

Screen Shot 2018-06-27 at 11.50.39.png
A Sneak Peek at the New Search Module of the Upcoming Cerana 1 (TheHive 3.1) Release

Indeed, in the RBAC world, the conversion of any search queries submitted to TheHive into an Elasticsearch one is fully dependent on the user context. The user view must be kept within the boundaries of the group or groups to which they belong. Each search filter,  each search parameter, must return only the results that the user can view.

The data scope needs to be clearly identified at the case level. To perform a search against task logs for example, TheHive will need to identify the parent task log, then identify the parent case and only then verify the scope. This is no small undertaking.

Similarities across cases or alerts, such as the Related Cases feature or the relationships between a given alert and existing cases, would need additional work that has not been clearly identified at this stage. But the difficulties do not stop there. Any element that has no clear relationship with case entities will have to be singled out and specific code would need to be added to limit access according to the RBAC rules. This will be clearly the case for the audit trail. Also, what should TheHive display when an analyst group is working on a case that shares observables with another one belonging to a different group? Shall it allow a limited view without any details so that groups may request from a super administrator to authorize both groups to collaborate on the investigation, something that distributed CERTs or SOCs in a large corporation may desire? Or shall it keep the data completely isolated as MSSPs which serve multiple customers with a single instance will require? We know the answer: make it configurable. But take a step back and think of the implications at the code (and security) level.

Contrary to the feature we added to Cortex 2, which allow multiple organizations to use a single Cortex instance, multi-tenancy in TheHive is a much more complex feature to implement and which is expected to have a significant impact on the platform’s performance. It will also need extreme caution to avoid blind spots that attackers (and not so innocent tenants) may exploit to circumvent scope limitations and extend their view to data they are not supposed to access. That’s why we had to delay it to Cerana 2 (TheHive 3.2), currently planned for the end of October 2018.

If you are well versed in Elasticsearch and Scala and willing to help implement this feature, please contact us at support@thehive-project.org.