Hikvision File System parser - DVR videos #1776

Open
wants to merge 29 commits into
base: master
from

Conversation

gfd2020
Collaborator

@gfd2020 gfd2020 commented Jul 17, 2023

Allows you to index videos recorded on a DVR with the Hikvision file system.
This PR still needs a lot of work. To function, it turns off several flags and modifies the program's operation, which causes malfunctions in normal use of IPED.

@lfcnassif
Member

Thanks @gfd2020 for this contribution. I'm changing this to draft until it is ready.

@paulobreim

Three years ago I needed to research Hikvision to try to recover images from an HD that had been formatted.

There is no official documentation, and the on-disk format is Hikvision's own.

I managed to find some (paid) software to recover the videos, but none of it recovers the logs, which are extremely important for forensic purposes.
I gathered some technical information and ended up writing a C program that recovers the logs.
All the material is available and I can help in any way.
I also have some material I collected on the internet, and emails from Hikvision staff who could perhaps provide some documentation.

paulo

@gfd2020
Collaborator Author

gfd2020 commented Jul 20, 2023

Hi Paulo.

The implementation of this PR is based on the document below. The values of the logs are not being analyzed yet, but from what I've seen, it's fine to get the log values from the RATS fields. What you could help with is interpreting the RATS field descriptions. I looked at the hex and didn't understand the logic of the description.

https://eudl.eu/pdf/10.1007/978-3-319-25512-5_13

@gfd2020
Collaborator Author

gfd2020 commented Sep 12, 2023

Hi @lfcnassif .

Could you help me with the questions below so I can proceed with this PR?

  1. As processing needs to change several IPED parameters, would it be feasible to create a new type of profile? For example, 'dvr'?

  2. SleuthkitInputStreamFactory.java needs to be updated, changing the line 'this.emptyContent = true;' to 'this.emptyContent = false;'. Would it be plausible to create a configuration variable for this? In which configuration file would it go?

  3. FragmentLargeBinaryTask is always enabled. However, this PR needs to disable it. Would it be possible to make this flag configurable again?

  4. If this PR's task has some configuration parameters, should these parameters be in a new config file? For example: DVRTaskConfig.txt ?

  5. Do you think the implementation is going in a good direction, or does it need re-engineering? Perhaps in the future other DVR file systems (or less common file systems not identified by Sleuthkit) will be added.

@patrickdalla
Collaborator

Wouldn't it be more appropriate to implement this kind of FS as a DataSourceReader?

@lfcnassif
Member

Hi @gfd2020, sorry for my delay, currently I'm traveling on vacation...

  1. Not sure if it is the best approach, maybe there is a better solution...
  2. A while ago, I thought about creating another abstract method in AbstractTask returning a boolean to tell whether the task processes ordinary items/files or raw disks/partitions/file systems. If true, the raw content of disks, partitions or file systems would be available to the task. This could be useful for this feature and for future ones. Some changes in the code base would be needed to differentiate ordinary files from raw disks, partitions or FSes.
  3. Unfortunately, today it needs to be always enabled to avoid issue #1281 (Aborting ArrayIndexOutOfBoundsException while indexing huge files). A possible solution for you would be to skip fragments; there is a flag tagging them. AFAIK you can access the content of the original, unsplit data normally.
  4. If the parameters are specific to this new task, creating such a config file seems fine to me.
  5. I'm not sure since I didn't have time to look at the proposed code changes here yet, sorry... I'll try to take a look after I return from vacation. Ideally this kind of support would fit better if implemented directly into Sleuthkit project...
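
To make points 2 and 3 concrete, here is a minimal, self-contained sketch of the idea. Every name in it (`Item`, `Task`, `processesRawContent`, `DvrTask`, `dispatch`) is hypothetical and only illustrates the proposal; none of it is the actual IPED API.

```java
import java.util.ArrayList;
import java.util.List;

public class RawContentDispatch {

    // Hypothetical stand-in for a processed item; NOT the real IPED item class.
    static class Item {
        final String name;
        final boolean rawDevice; // disk, partition or unrecognized file system
        final boolean fragment;  // piece produced by splitting a large binary

        Item(String name, boolean rawDevice, boolean fragment) {
            this.name = name;
            this.rawDevice = rawDevice;
            this.fragment = fragment;
        }
    }

    // Hypothetical task base class carrying the proposed opt-in method.
    static abstract class Task {
        final List<String> processed = new ArrayList<>();

        // Proposed hook: true if the task wants raw disks/partitions/FSes.
        abstract boolean processesRawContent();

        void process(Item item) {
            processed.add(item.name);
        }
    }

    // A DVR-style task: opts in to raw content and ignores fragments.
    static class DvrTask extends Task {
        @Override
        boolean processesRawContent() {
            return true;
        }

        @Override
        void process(Item item) {
            if (item.fragment) {
                return; // skip fragments; the unsplit parent item is handled instead
            }
            super.process(item);
        }
    }

    // An ordinary task that never receives raw devices.
    static class OrdinaryTask extends Task {
        @Override
        boolean processesRawContent() {
            return false;
        }
    }

    // Raw items are handed only to tasks that opted in.
    static void dispatch(Task task, Item item) {
        if (item.rawDevice && !task.processesRawContent()) {
            return;
        }
        task.process(item);
    }
}
```

With this shape, FragmentLargeBinaryTask could stay enabled for every other task, while a task that needs the whole raw stream simply ignores items tagged as fragments.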

@lfcnassif
Member

> Wouldn't it be more appropriate to implement this kind of FS as a DataSourceReader?

This makes sense. But we would lose the ability to read this file system's raw data from E01, Ex01, VMDK, VHD, VHDX and other disk image formats, and also to decode MBR and GPT partitions transparently, since TSK decoding would be skipped... As I said, the ideal approach would be to implement this in TSK, but I think it would need much more effort.

@patrickdalla
Collaborator

patrickdalla commented Sep 13, 2023 via email

@lfcnassif
Member

> Today, what happens if a VHD, a UFDR or a DD image is found inside the original evidence passed to IPED? How is their content processed?

They are passed to and decoded by SleuthkitReader through EmbeddedDiskProcessTask, except UFDR, which is not processed recursively but as a common zip file.

@patrickdalla
Collaborator

Couldn't these Hikvision partitions/disks be passed to be decoded by some HikvisionReader through the same EmbeddedDiskProcessTask?

@lfcnassif
Member

> Couldn't these Hikvision partitions/disks be passed to be decoded by some HikvisionReader through the same EmbeddedDiskProcessTask?

They could. But I'm not sure a HikVisionReader would be used directly in practice (like SleuthkitReader), since I think a DVR FS is usually inside a partition (could you confirm, @gfd2020?) that needs to be decoded first by TSK, so I think this would create another step for most cases.

@gfd2020
Collaborator Author

gfd2020 commented Sep 13, 2023

> They could. But I'm not sure a HikVisionReader would be used directly in practice (like SleuthkitReader), since I think a DVR FS is usually inside a partition (could you confirm, @gfd2020?) that needs to be decoded first by TSK, so I think this would create another step for most cases.

In the real cases I have, the disk is not initialized. The first sector of the disk is all zeros (that is, there is no partition table). The Hikvision FS starts in the second sector. Would this help with Patrick's suggestion?
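
As an illustration of the layout described above, here is a small sketch that checks whether a raw image starts with an all-zero sector and then carries a given signature further in. The 512-byte sector size, the class and method names, and the idea of passing the signature and its offset as parameters are assumptions for the example; the real signature bytes and offset should be taken from the referenced paper or from a sample image.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RawDiskProbe {
    static final int SECTOR_SIZE = 512; // assumed sector size

    // True if the first sector contains only zero bytes (no MBR/GPT present).
    static boolean firstSectorIsEmpty(byte[] disk) {
        if (disk.length < SECTOR_SIZE) {
            return false;
        }
        for (int i = 0; i < SECTOR_SIZE; i++) {
            if (disk[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // True if the ASCII signature is found exactly at the given byte offset.
    static boolean hasSignatureAt(byte[] disk, int offset, String signature) {
        byte[] sig = signature.getBytes(StandardCharsets.US_ASCII);
        if (offset < 0 || offset + sig.length > disk.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOfRange(disk, offset, offset + sig.length), sig);
    }

    // Combines both checks: empty first sector plus a signature after it.
    static boolean looksLikeUnpartitionedDvrDisk(byte[] disk, int sigOffset, String signature) {
        return firstSectorIsEmpty(disk) && hasSignatureAt(disk, sigOffset, signature);
    }
}
```

A check like this only needs the first few sectors, so it could run on the decoded stream of an E01 image before any heavier parsing is attempted.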

@patrickdalla
Collaborator

So SleuthKit would not show this as an item to be processed either, and it seems that this "item" should first be carved from the unallocated space (as SleuthKit would identify it), right?

@gfd2020
Collaborator Author

gfd2020 commented Sep 13, 2023

> So SleuthKit would not show this as an item to be processed either, and it seems that this "item" should first be carved from the unallocated space (as SleuthKit would identify it), right?

Sleuthkit will complain that it does not recognize this item. So Nassif gave me the tip to change a flag in the SleuthkitInputStreamFactory file to continue decoding the image bytes. From then on, as far as I remember, IPED fragmented everything and looked for carved files.

IPED 4.1.4 log, default config. Result:

Decoding image C:\HVFS\test.E01
org.sleuthkit.datamodel.TskCoreException: Errors occurred while ingesting image

  1. Cannot determine file system type (Sector offset: 0)

    at org.sleuthkit.datamodel.SleuthkitJNI.runAddImgNat(Native Method)
    at org.sleuthkit.datamodel.SleuthkitJNI$CaseDbHandle$AddImageProcess.run(SleuthkitJNI.java:584)
    at org.sleuthkit.datamodel.SleuthkitJNI$CaseDbHandle$AddImageProcess.run(SleuthkitJNI.java:544)
    at iped.engine.datasource.SleuthkitReader.addImageBlocking(SleuthkitReader.java:600)
    at iped.engine.datasource.SleuthkitReader$1.call(SleuthkitReader.java:653)
    at iped.engine.datasource.SleuthkitReader$1.call(SleuthkitReader.java:650)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

@lfcnassif
Member

lfcnassif commented Sep 13, 2023

> Would this help with Patrick's suggestion?

Yes, thanks!

> it seems that this "item" to be processed should first be carved from the unallocated space (as it would be identified by SleuthKit), right?

I think so. Not sure, but I think @gfd2020's draft is reading data from the parent image.

So it seems possible to create a HikVisionReader to decode DD images. All others, like E(x)01, VMDK and VHD(X), would need to pass through SleuthkitReader first. Honestly, I'm not sure which approach is better, creating a DataSourceReader or a Task. I think both would work, but the first would need an additional change in EmbeddedDiskProcessTask.

@paulobreim

I have some documentation on the format of this HD.
It could be treated like any other OS, with specific development for it, since as far as I have researched there is no open-source software available.
The Hikvision format has an identification header, some areas with information on the date format and HD size, an event log area, and video addressing.
There is some software on the market that extracts videos, such as DiskInternals. I looked at several tools; they all reported slightly different information, and none of them extracts the event log. The biggest discrepancy I found between the tools I tested was the recording date and time of the videos. But the HD I had to work on had been formatted, and perhaps this affected it in some way.
I wrote a C program that lists the event logs, which are a very important part of the examination, and it worked perfectly, including showing logs that the Hikvision software itself did not show.
Looking at the documentation I located the points where the logs start and end, but for testing purposes I scanned the entire HD and found more logs.

@gfd2020
Collaborator Author

gfd2020 commented Sep 14, 2023

Hi @paulobreim . I will make a change to the PR now for greater compatibility. Could you please try to improve log detection?

@gfd2020
Collaborator Author

gfd2020 commented Sep 14, 2023

After a little more work, I managed to get it working without changing any configuration parameters. In fact, you don't need to touch any of them. Basically I created a new constructor for the SleuthkitInputStreamFactory class to be able to use the decoded content of e01 (no external variable will be needed anymore). The main change that could impact the program in general is in the SleuthkitReader class. From what I tested, IPED worked normally on other types of images. However, we need to do regression tests. I'm finishing the changes and will commit. @lfcnassif , could you take a look when you get back from vacation? Thanks.

parseUnknownFiles should preferably be disabled. If turned on, processing may take longer.

@lfcnassif
Member

Hi @gfd2020. Since you said you would like to make changes for greater compatibility, I talked to @gbatmobile, a colleague from work who, together with one of his students, implemented DVR WFS file system support into a Sleuthkit fork, and also in python as a standalone tool. Here are the links he sent to me, perhaps they may help:
https://github.com/gbatmobile/sleuthkit4.9.0-wfs
https://github.com/gbatmobile/wfs-python

I don't know anything about DVR file systems (I have never needed to decode one), so possibly they are completely different file systems. I'm also not sure whether they share anything in common. If not, please ignore this post; WFS support might be added in the future as a separate improvement...

@gfd2020
Collaborator Author

gfd2020 commented Sep 16, 2023

> I have some more documentation regarding the HD layout, do you need it?

Hi @paulobreim, if your documentation has information beyond the paper below, yes please.
https://eudl.eu/pdf/10.1007/978-3-319-25512-5_13

I ran your code in my test case but there were no results.
From what I saw of your code, the log text field still comes with some type of encoding (the same as in what I implemented). It has plain text and some hexadecimal values. As I didn't find any information about this, the only way would be to take a real case and compare with what the DVR shows on its interface.

@gfd2020
Collaborator Author

gfd2020 commented Sep 16, 2023

Hi @lfcnassif .

Thanks. I think this could really help with integration. As for DVR FS's, from what I've seen, they don't have many features, basically a way to save videos and save metadata (channel, date and eventually logs).

If I'm not mistaken, as you said before, there are two ways to do the integration. Modify the Sleuthkit or forward the content to handle the FS in IPED. Both have pros and cons.

Modifying the Sleuthkit would be ideal as you wouldn't even need to change the IPED. However, this modification will not be simple and if the main Sleuthkit is updated, you have to check that the integrated module has not become incompatible. And this last step will have to be done for each new FS that is integrated.
Let's look at WFS. It is integrated into Sleuthkit 4.9. IPED is on 4.11, right? We would have to merge the versions, right?

Personally (I could be completely wrong), I think it would be easier to integrate with IPED and minimally change Sleuthkit (since it is updated frequently).
For this, I think IPED would work the same as it does today, giving an error about an unrecognized FS. If that FS is on an exception list, then IPED would pass the data on for internal processing. If this FS is implemented in the future in Sleuthkit, just remove it from the list. I think that would be easier.
I did it in this PR. I had to turn off the fragmentation, entropy and unknown-files tasks to speed up the process (but it's not mandatory).
Even WFS would be almost ready, it would just be a matter of translating your colleague's Python code to Java in a TASK.
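
A rough sketch of the exception-list idea, to show how small it could be. The class `FallbackFsRegistry` and its methods are hypothetical, not existing IPED or Sleuthkit code; each entry maps a file system name to a detector that inspects the first bytes of the volume.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

public class FallbackFsRegistry {
    // File systems Sleuthkit does not recognize, keyed by name, in registration order.
    private final Map<String, Predicate<byte[]>> detectors = new LinkedHashMap<>();

    // Adds a file system to the exception list.
    void register(String fsName, Predicate<byte[]> detector) {
        detectors.put(fsName, detector);
    }

    // Removes an entry, e.g. once Sleuthkit gains native support for it.
    void unregister(String fsName) {
        detectors.remove(fsName);
    }

    // Called after Sleuthkit reports an unrecognized file system:
    // returns the first registered file system whose detector accepts the header.
    Optional<String> identify(byte[] header) {
        for (Map.Entry<String, Predicate<byte[]>> entry : detectors.entrySet()) {
            if (entry.getValue().test(header)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

If `identify` returns a name, the data would be forwarded to the matching internal parser; otherwise the current behaviour (report the unrecognized FS and fall back to carving) stays unchanged.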

Anyway, I can try to integrate your colleague's WFS Sleuthkit code into IPED, since it's ready (and it would be a proof of concept for Hikvision), but I believe it won't be easy, and I'm not sure I will be able to do it.

@lfcnassif
Member

Hi @gfd2020. The Sleuthkit integration would need to be done in its official project, precisely so that we don't have to maintain a more complex fork than we have today and apply patches to future TSK versions. But we have no idea whether a PR to the official TSK would be accepted. I have sent some simple ones fixing critical bugs that took many months to be accepted, after I complained a lot about getting zero feedback... I agree this ideal path would be harder. Maybe we could open an issue in TSK asking whether they are interested in this feature, if you think this path is worthwhile and you still have the time and effort available. I leave the decision to you.

> Let's look at WFS. It is integrated into Sleuthkit 4.9. IPED is on 4.11, right? We would have to merge the versions, right?

I think we are in 4.12, not sure. But there is no need to implement WFS support now, don't worry. I just sent the links because maybe they could help. Seems it's a different file system and supporting it, if desired, can be left as a future improvement.

@paulobreim

The document you sent is the one I have, and the program does exactly what is in section 2.2, System Logs.
The document says that the log area is described by the following fields:
0x260 system log offset (00 32 D1 03 00 00 00 00)
0x268 system log size (00 2C F4 00 00 00 00 00)
From the beginning of the log area, each log starts with the sequence (52 41 54 53 01 00 00 00), but I found this same sequence in areas other than the one described in the documentation, which is why the program scans the entire disk.
In the image I used in 2020, all of these logs were valid, so the image did not match the documentation.
Unfortunately I no longer have a Hikvision image to run further tests.
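
For reference, a minimal sketch of the two steps described above: decoding the 8-byte fields at 0x260 and 0x268, and scanning a buffer for the log signature. It assumes the fields are little-endian, which is an assumption (the values then come out as plausible sizes). The class and method names are made up for the example, and this is not the C program mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class HikLogScan {
    // "RATS" followed by 01 00 00 00, the per-log sequence quoted above.
    static final byte[] LOG_SIGNATURE = {0x52, 0x41, 0x54, 0x53, 0x01, 0x00, 0x00, 0x00};

    // Reads a 64-bit value at the given offset, assuming little-endian byte order.
    static long readLe64(byte[] buf, int offset) {
        return ByteBuffer.wrap(buf, offset, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // Returns every offset in buf at which the log signature starts.
    static List<Integer> findLogEntries(byte[] buf) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + LOG_SIGNATURE.length <= buf.length; i++) {
            boolean match = true;
            for (int j = 0; j < LOG_SIGNATURE.length; j++) {
                if (buf[i + j] != LOG_SIGNATURE[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The example bytes from the comment above.
        byte[] logOffset = {0x00, 0x32, (byte) 0xD1, 0x03, 0x00, 0x00, 0x00, 0x00};
        byte[] logSize = {0x00, 0x2C, (byte) 0xF4, 0x00, 0x00, 0x00, 0x00, 0x00};
        System.out.println(readLe64(logOffset, 0)); // 64041472 (0x03D13200)
        System.out.println(readLe64(logSize, 0));   // 16002048 (0x00F42C00)
    }
}
```

Scanning only the range given by offset and size is the fast path; running `findLogEntries` over the whole disk in chunks (with a 7-byte overlap between chunks) would reproduce the full-disk scan that found the extra logs.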

I'm going to try to find Hikvision HDs on Rua Santa Efigenica, in São Paulo, to run tests, because I don't know whether anything changes depending on the model.
At the same time, I'm trying to speak directly to the manufacturer to see if they can provide any documentation.

@gfd2020
Collaborator Author

gfd2020 commented Sep 16, 2023

Hi @paulobreim. I think I didn't explain it well. I am able to read all the logs, with their types and creation times. However, what is not clear is the text inside the log itself, the field that starts after the log type. Some random text and a bunch of hexadecimal values appear, probably flags.

See below, "Description for a system log":

image

@gfd2020
Collaborator Author

gfd2020 commented Sep 16, 2023

> Hi @gfd2020. The Sleuthkit integration would need to be done in its official project, precisely so that we don't have to maintain a more complex fork than we have today and apply patches to future TSK versions. But we have no idea whether a PR to the official TSK would be accepted. I have sent some simple ones fixing critical bugs that took many months to be accepted, after I complained a lot about getting zero feedback... I agree this ideal path would be harder. Maybe we could open an issue in TSK asking whether they are interested in this feature, if you think this path is worthwhile and you still have the time and effort available. I leave the decision to you.

Honestly, I think it's unlikely that they will allow integration (having seen your report of minor bug fixes).
If you find the second option of not integrating with sleuthkit viable, I will unfreeze this PR.

> I think we are in 4.12, not sure. But there is no need to implement WFS support now, don't worry. I just sent the links because maybe they could help. Seems it's a different file system and supporting it, if desired, can be left as a future improvement.

Thank you. I think it will be of great help for future integration

@lfcnassif
Member

> If you find the second option of not integrating with sleuthkit viable, I will unfreeze this PR.

Sure! Integrating directly into IPED is a reasonable option. Let me know when you finish; I'll try to review it after I get back and go through some older PRs. Thank you for contributing again!

@paulobreim

After the log type there is the IP address, if the event is 3. I don't remember seeing other messages at that time.
When I get a Hikvision HD I will check it.

@gfd2020
Collaborator Author

gfd2020 commented Sep 18, 2023

I found the information we need.
https://github.com/theAtropos4n6/HikvisionLogAnalyzer

@lfcnassif
Member

Hi @gfd2020! Is this ready for review or do you plan to do more changes?

@gfd2020
Collaborator Author

gfd2020 commented Oct 29, 2023

> Hi @gfd2020! Is this ready for review or do you plan to do more changes?

hi @lfcnassif . Yes, I need help transforming the log information into a new text file.
Is there a parser that does something similar?

@lfcnassif
Member

> hi @lfcnassif . Yes, I need help transforming the log information into a new text file.
> Is there a parser that does something similar?

Hi @gfd2020! Not sure if I understood, but I think you have 2 options:

  1. Generate a txt, html or csv preview of the file (log), like the VCardParser or KnownMetParser. For this to work, you should add the parser's supported MIME type to conf/MakePreviewConfig.txt;
  2. Create a subitem in txt, html, csv or any other format from your parser.

@gfd2020 gfd2020 marked this pull request as ready for review November 17, 2023 23:45
@patrickdalla
Collaborator

Hi @gfd2020. Nassif asked me to test/review this. Could you or @paulobreim share some sample Hikvision images?

@gfd2020
Collaborator Author

gfd2020 commented Jan 5, 2024

> Hi @gfd2020. Nassif asked me to test/review this. Could you or @paulobreim share some sample Hikvision images?

Hi @patrickdalla. I think it will be a bit complicated; the image I have is 2 TB compressed...
Maybe @paulobreim has a smaller one ...

@patrickdalla
Collaborator

patrickdalla commented Jan 5, 2024 via email

@paulobreim

@gfd2020, is your idea for Hikvision HDs to recover images and logs from an HD that has been formatted, or from one that has not?

@gfd2020
Collaborator Author

gfd2020 commented Jan 6, 2024

> @gfd2020, is your idea for Hikvision HDs to recover images and logs from an HD that has been formatted, or from one that has not?

It just shows the videos that are in the proprietary file system. Recovering deleted data is not an objective of this PR; perhaps that can be future work.


4 participants