Hash algorithm

Introduction

A hash algorithm transforms an input of arbitrary length (also called the pre-image) into an output of fixed length. That output is the hash value.

Strictly speaking, "hash algorithm" names an idea, a category of algorithms, rather than one specific algorithm. MD4, MD5 and SHA-1 all share the characteristics of a hash algorithm:

1. Fixed output length

Regardless of the input length, the output is a fixed number of bits (or, equivalently, bytes).
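
The fixed output length is easy to observe directly. The following is a minimal sketch using Python's standard hashlib module; the helper name digest_sizes and the sample inputs are illustrative assumptions, not part of the original text.

```python
import hashlib

def digest_sizes(data: bytes) -> dict:
    """Return the digest length in bytes for MD5 and SHA-1 on the given input."""
    return {
        "md5": len(hashlib.md5(data).digest()),
        "sha1": len(hashlib.sha1(data).digest()),
    }

# Illustrative inputs of very different lengths (not taken from the article).
for sample in (b"", b"a", b"hello world", b"x" * 1_000_000):
    # The digest sizes stay the same while the input length changes.
    print(len(sample), digest_sizes(sample))
```

MD5 always produces 16 bytes (128 bits) and SHA-1 always produces 20 bytes (160 bits), whether the input is empty or a megabyte long.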

2. Many-to-one

The set of possible inputs is infinite, while the set of possible outputs is finite. Collisions are therefore inevitable: some different inputs must produce the same hash value.

It follows that:

(1) If two hash values are the same, the original data may be the same or may be different. Identical data, of course, always produces the same hash.
(2) A hash value cannot be reliably reversed to recover the original input.
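
The many-to-one property can be made visible by shrinking the output space. The sketch below assumes a hypothetical 8-bit hash, tiny_hash, built by keeping only the first byte of SHA-256. With only 256 possible outputs, any 257 distinct inputs must contain a collision. Real hash functions behave the same way in principle, only with a far larger output space.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Hypothetical 8-bit hash: the first byte of SHA-256 (only 256 outputs)."""
    return hashlib.sha256(data).digest()[0]

def find_collision(limit: int = 257):
    """Hash the strings "0", "1", ... and return the first colliding pair."""
    seen = {}  # hash value -> first input that produced it
    for i in range(limit):
        data = str(i).encode()
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    return None  # unreachable with limit >= 257 (pigeonhole principle)

first, second, value = find_collision()
print(first, second, value)  # two different inputs, one shared hash value
```

Knowing only the shared value, there is no way to tell which of the two inputs produced it, which is why a hash cannot be reliably reversed.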

Uses

Verifying data integrity

For example, a website that offers file downloads usually publishes the hash algorithm it used together with the official hash value of each file.
After downloading, users can run the same hash algorithm on the file and compare the result with the official value, confirming that the downloaded file is identical to the official one.
This guards against incomplete downloads and against tampering, for example by a man-in-the-middle.
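
A download check of this kind might look like the sketch below. It assumes the site publishes a SHA-256 value as a hex string; the function names file_sha256 and matches_published are illustrative, and the algorithm should be swapped for whichever one the site actually names.

```python
import hashlib
import hmac

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the local hash with the published one (case-insensitive)."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), expected)
```

A result of False means the local file differs from the one the site hashed, whether through a truncated download or through modification in transit.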

Sorting and fast lookup

With a large collection of long texts, checking whether a given item exists, or deleting or updating a given item, is cumbersome and slow if the texts themselves have to be compared one after another.

One optimization is to compute a hash of each text and, going a step further, to sort the hash values. To find an item, compute its hash value and search for that value among the stored hashes. To delete or update an item, retrieve the text that the matched hash value maps to.
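
One way to sketch this idea is a sorted list of (hash, text) pairs searched with binary search. The class name SortedHashIndex and the choice of MD5 as the key function are assumptions made for illustration. The text is stored next to its hash so that two texts with colliding hashes are still told apart.

```python
import bisect
import hashlib

def text_key(text: str) -> str:
    """Short fixed-length key for a text of any length."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

class SortedHashIndex:
    def __init__(self, texts=()):
        # Entries are ordered by hash first; the text only breaks ties.
        self._entries = sorted((text_key(t), t) for t in texts)

    def _locate(self, text: str):
        entry = (text_key(text), text)
        i = bisect.bisect_left(self._entries, entry)
        found = i < len(self._entries) and self._entries[i] == entry
        return i, found

    def contains(self, text: str) -> bool:
        return self._locate(text)[1]

    def add(self, text: str) -> None:
        i, found = self._locate(text)
        if not found:
            self._entries.insert(i, (text_key(text), text))

    def remove(self, text: str) -> bool:
        i, found = self._locate(text)
        if found:
            del self._entries[i]
        return found
```

Most comparisons are between short hash strings rather than full texts, so a lookup takes a logarithmic number of cheap comparisons. An update can be expressed as remove followed by add.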

Hash cracking

1. Identifying the hash algorithm

There are two approaches. The first is to feed in known data and check whether the output matches the result of a known, commonly used hash algorithm.
The second is to test whether the algorithm can be reconstructed from common hash construction techniques, such as division (remainder) methods and the size of the address space.
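
The first approach can be sketched as follows: hash a known input with each algorithm that hashlib guarantees and report which ones reproduce the observed output. The function name guess_algorithms is hypothetical, and the sketch assumes the target applies a plain, unsalted, single-round hash to the raw input bytes.

```python
import hashlib

def guess_algorithms(known_input: bytes, observed_hex: str) -> list:
    """Return names of standard algorithms whose digest equals the observed one."""
    observed = observed_hex.strip().lower()
    matches = []
    for name in sorted(hashlib.algorithms_guaranteed):
        if name.startswith("shake_"):
            continue  # variable-length output; needs an explicit length
        try:
            candidate = hashlib.new(name, known_input).hexdigest()
        except ValueError:
            continue  # algorithm disabled in this Python build
        if candidate == observed:
            matches.append(name)
    return matches
```

An empty result only means that none of the tried algorithms matched in this simple form. The real scheme may add a salt, chain several rounds, or use a custom construction.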

2. Recovering the original text

As mentioned earlier, the properties of hash algorithms make it impossible to reliably deduce the original text from a hash value.

In a fixed context, however, inputs tend to be highly similar and closely related. Choosing a password is one example.

Weak passwords can be collected in advance and their hash values computed. Once the hash of a password has been obtained, the original password can be found by looking it up in the table and then used in the next stage of testing.
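
The table lookup can be sketched as a dictionary from hash value to original text. The password list below is a made-up illustration, and the sketch assumes the passwords were hashed with a single unsalted round of the chosen algorithm. The names build_table and lookup are hypothetical.

```python
import hashlib

# Hypothetical weak-password list, for illustration only.
WEAK_PASSWORDS = ["123456", "password", "qwerty", "admin"]

def build_table(candidates, algorithm: str = "md5") -> dict:
    """Precompute hash -> original text for every candidate."""
    return {
        hashlib.new(algorithm, p.encode("utf-8")).hexdigest(): p
        for p in candidates
    }

def lookup(table: dict, hash_hex: str):
    """Return the original text for a hash, or None if it is not in the table."""
    return table.get(hash_hex.strip().lower())

table = build_table(WEAK_PASSWORDS)
print(lookup(table, hashlib.md5(b"qwerty").hexdigest()))
```

The lookup succeeds only for inputs that were in the candidate list, which is why this works against weak passwords. It is also why a per-user salt, which makes a single precomputed table useless, is a common defence.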