compSec Lecture10
View on GitHub | Download Local
Extracted Content (for search)
Click to view slide text
CS 4173/5173 COMPUTER SECURITY The Length of Hash
OUTLINE LAST TIME Hash functions:
- Performance: Easy to compute H(m)
- One-way property: Given H(m) but not m, it’s computationally infeasible to find m
- Weak collision resistance: Given H(m), it’s computationally infeasible to find m’ such that H(m’) = H(m).
- Strong collision resistance: computationally infeasible to find m1, m2 such that H(m1) = H(m2)
2
EXERCISE I (SECRET DEAL) • A and B have a financial agreement on something.
‒ They want to keep the agreement confidential, no third-party should know the content of the agreement. ‒ But they also want to have a third-party to certify the agreement, just in case…
3
EXERCISE II (A NEW HASH?) • Bob designed a very fast hash function, which is to XOR every 128 bits in a message. 1st 128bits
2nd 128bits
……
nst 128bits
Message
XOR all 128-bit blocks 128-bit hash
4 What is the flaw?
EXERCISE III (FILE MANAGEMENT) • A file system contains millions of data files. • Design a scheme to detect if a given file exists in the system. – i.e., find out if there already exists a file that has the same content with the given file.
• Specify your design – Cost in terms of storage and computation – Efficiency in file finding and comparison 5
ESSENCE IN HASH-BASED SECURE DESIGN • Make sure the input has a very large value space so it is computationally difficult to do exhaust search. ‒ Use salt ‒ Use random padding if necessary ‒ ……
6
HASH FUNCTION WITH SHORT OUTPUT
Message m of arbitrary length
Hash H(.)
2-bit H(m)
Does this hash function have the strong collision resistance property? Why?
7
LENGTH OF HASH Message m of arbitrary length
No key here! Hash H(.)
A fixed-length short message H(m)
• Question
‒ Why do we have 128 bits,160 bits, 256 bits in the output of a hash function? ‒ If it is too long • Unnecessary overhead
‒ If it is too short
• Loss of strong collision free property
8
BIRTHDAY PARADOX: SIMPLE QUESTION • Question:
‒ What is the probability that two persons have same birthday? ‒ (Equivalent question: what is the probability the hashes of two different inputs are the same?) • Assume 365 days a year, and all birthdays are equally likely
‒ Probability is 1 - 365 * (365-1)/(365*365).
9
BIRTHDAY PARADOX: GENERAL QUESTION • Question:
‒ What is the probability that at least two persons among k persons have the same birthday? • Assume 365 days a year, and all birthdays are equally likely
‒ Probability is 1 – 365*(365-1)(365-2)…*(365-k+1)/365k = 1 - 365!/(365-k)!/365k
10
BIRTHDAY PARADOX • Question:
‒ What is the probability that at least two persons among k persons have the same birthday? • Assume 365 days a year, and all birthdays are equally likely • Probability is 1 - 365!/(365-k)!/365k • The larger k, the larger the probability.
• New Question: Find the values of k such that the probability is greater than 0.5.
‒ Why 0.5? Just an example to show when the value of k could make the collision probability non-negligible • i.e., collision is really likely to happen.
‒ 1 - 365!/(365-k)!/365k >= 0.5 k >= 23. 11
BIRTHDAY PARADOX: GENERALIZATION • Generalization of birthday paradox ‒Given
• random variable with uniform distribution between 1 and n • n=365 in the birthday paradox
• a selection of k random variables • k persons in birthday paradox
‒What is the least value of k such that
• There will be at least two variables with the same value with probability P(n,k) > 0.5 ?
12
BIRTHDAY PARADOX: GENERALIZATION • Generalization of birthday paradox
‒P(n,k) = 1-{n!/(n-k)!nk } ≈ 1 – e-k*(k-1)/2n ‒For large n and k, to have P(n,k) > 0.5 with the smallest k, we have k = 2(ln2)n = 1.18 n ≈ n
‒Example
• Find the values of k such that the probability that two persons have the same birthday is greater than 0.5. (n=365) • 1.18*(365)1/2 = 22.54 13
BIRTHDAY PARADOX (CONT’D) • Implication for hash function H of bit length m
• The hash value of an arbitrary input message is randomly distributed between 1 and 2m, i.e., (n=2m)
message m1 message m2
…
…
Hash H(.)
Two hashes have the same value
message mk – What is the least value of k such that
• If we hash k messages, the probability that at least two of them have the same hash (i.e., a collision) is larger than 0.5.
k ≈ n = 2m = 2m / 2 14 Design objective: make sure k is very large so an attacker sees the computational difficulty
BIRTHDAY PARADOX (CONT’D) • Typical hash length: 128 (MD5), 160 bits (SHA-1), 256 (SHA-256), 512 (SHA-512) m
k ≈ n = 2 =2
m/2
• 264 =18446744073709551616 • 280 =1208925819614629174706176
• It takes 38,334,786 years for a computer with 1 billion hashes per second.
15
SUMMARY • Design of Hash should balance the tradeoff between overhead and security ‒ If it is too long
• Unnecessary overhead
‒ If it is too short
• Loss of strong collision free property • Birthday paradox
• MD5: 128 bits – not secure any more • SHA1: 160 bits – not (very) secure any more • SHA256: 256 bits • SHA384, SHA512, … 16
CS 4173/5173 COMPUTER SECURITY The MD5 Hash Function
MD5: MESSAGE DIGEST VERSION 5 • MD5 at a glance Message of arbitrary length MD5 (multiple passes)
128-bit message digest 18
PROCESSING OF A SINGLE BLOCK 512-bit message block (sixteen 32-bit words) 128-bit input message digest (four 32-bit words)
MD5
128-bit output message digest (four 32-bit words) the bytes of each 32-bit word are stored in little-endian order (LSB to MSB)
1 word = 4 bytes = 32 bits 19
MD5: A HIGH-LEVEL VIEW Padding (1 to 512 bits)
K bits Message 512 bits
128 bits
Init Value
100…0
512 bits
Y0
…
Y1 128 bits
128 bits
MD5
MD5
stage 1
stage 2
Message Length (64 bits)
…
512 bits
YL-1 128 bits
MD5
128-bit digest
stage L 20
PADDING • To original message M, add padding bits “10…0”
‒ enough 0’s so that resulting total length is 64 bits less than a multiple of 512 bits
• Append L (the original length of M), represented in 64 bits, to the padded message
100……………0 L in binary format 21
INITIAL VALUES • The four 32-bit words of the output (the digest) are referred to as d0, d1, d2, d3 ‒ Initial values (in little-endian order) ‒ d0 = 0x67452301, d1 = 0xEFCDAB89 ‒ d2 = 0x98BADCFE, d3 = 0x10325476 Y0 128 bits
Init Value
…
Y1 128 bits
128 bits
MD5
MD5
stage 1
stage 2
…
YL-1 128 bits
MD5
128-bit digest
stage L 22
INSIDE AN MD5 BLOCK • Bits operations: fully specified: ‒ AND, OR, XOR, bit shifting, …
Block of message (512 bits) Previous hash Or d0, d1, d2, d3 (128 bits)
MD5
New hash (128 bits)
23
(IN)SECURITY OF MD5 • A few methods can find collisions in a few hours
‒ Birthday attack: need hash 264 messages to find a collision with probability 0.5.
• Recall from birthday paradox, we need to compute 2m/2 hashes to find two with the same value with probability 0.5. • For MD5, m = 128 bits, so brute-force attack needs 2128/2 =264.
‒ A few collisions were published in 2004 ‒ Can find many collisions for 1024-bit messages.
24
CS 4173/5173 COMPUTER SECURITY The SHA-1 Hash Function
SECURE HASH ALGORITHM (SHA) • Developed by NIST, specified in the Secure Hash Standard, 1993 • SHA is specified as the hash algorithm in the Digital Signature Standard (DSS) • SHA-1: revised (1995) version of SHA
26
SHA-1 PARAMETERS • Input message must be < 264 bits • Input message is processed in 512-bit blocks, with the same padding as MD5 • Message digest output is 160 bits long
‒ Referred to as five 32-bit words A, B, C, D, E ‒ IV: A = 0x67452301, B = 0xEFCDAB89, C = 0x98BADCFE, D = 0x10325476, E = 0xC3D2E1F0 (big-endian order)
Y0
Y1
YL-1
160 bits
Init Value A. B, C, D, E
160 bits
SHA1 stage 1
A. B, C, D, E
160 bits
SHA1 stage 2
…
A. B, C, D, E
160 bits A. B, C, D, E
SHA1 stage L
160-bit digest 27
PREPROCESSING OF A BLOCK • Let 512-bit block be denoted as sixteen 32-bit words W0..W15 • Preprocess W0..W15 to derive an additional sixty-four 32-bit words W16..W79, as follows: Note: << is for 16 ≤ t ≤ 79 “circular shift” Wt = (Wt-16 ⊕ Wt-14 ⊕ Wt-8 ⊕ Wt-3) << 1 Y0
A, B, C, D, E
Y1
Y = W0..W15 SHA1
YL-1
Preprocess inside 28
SHA1 BLOCK PROCESSING • Consists of 80 steps! (vs. 64 for MD5) • Inputs for each step 0 ≤ t ≤ 79: ‒ Wt ‒ Kt – a constant ‒ A,B,C,D,E: current values to this point
• Outputs for each step: ‒ A,B,C,D,E : new values
29
PROCESSING PER STEP • Everything to right of “=” is input value to this step (i.e., old values) for t = 0 to 79
A = E + (A << 5) + Wt + Kt + f(t,B,C,D)
B = A
C = B << 30 D = C end
E = D Note: << is “circular shift”
30
FIGURE FOR ONE STEP
from wiki http://en.wikipedia.org/wiki/SHA-1 31
CONSTANTS KT • Only 4 values (represented in 32 bits), derived from 230 * i1/2, for i = 2, 3, 5, 10 ‒ for 0 ≤ t ≤ 19: Kt = 0x5A827999 ‒ for 20 ≤ t ≤ 39: Kt = 0x6ED9EBA1 ‒ for 40 ≤ t ≤ 59: Kt = 0x8F1BBCDC ‒ for 60 ≤ t ≤ 79: Kt = 0xCA62C1D6
32
FUNCTION F(T,B,C,D) • 3 different functions are used in SHA-1 processing Round
Function f(t,B,C,D)
0 ≤ t ≤ 19
(B∧C) ∨ (~B∧D)
20 ≤ t ≤ 39
B⊕C⊕D
40 ≤ t ≤ 59
(B∧C) ∨ (B∧D) ∨ (C∧D)
60 ≤ t ≤ 79
B⊕C⊕D
• ∧ - AND • ∨ - OR 33
COMPARISON: SHA-1 VS. MD5 • SHA-1 is a stronger algorithm
‒ brute-force attacks require on the order of 280 operations vs. 264 for MD5
• SHA-1 is about twice as expensive to compute • Both MD-5 and SHA-1 are much faster to compute than DES
34
SECURITY OF SHA-1 • SHA-1
‒ Still more secure than MD5
• SHA-1 has 160 bits, so 2160/2 =280 for brute-force attacks • MD5 has 128 bits, so 2128/2 =264 for brute-force attacks, and much less for other smart attacks.
35
FINDING SHA-1 COLLISION • https://security.googleblog.com/2017/02/announcing-firstsha1-collision.html • In 2013, Marc Stevens published a paper that outlined a theoretical approach to create a SHA-1 collision. • In 2017, first collision announced by Google:
‒ Nine quintillion (9,223,372,036,854,775,808/263) SHA1 computations in total ‒ 6,500 years of CPU computation to complete the attack first phase ‒ 110 years of GPU computation to complete the second phase
36
MODERN HASH FUNCTIONS
• MD6, SHA-256, SHA-384, SHA-512… • https://shattered.io/
37
USING HASH FUNCTIONS IN LINUX • MD5
‒ Command: md5sum [filename]
• Compute the MD5 hash of the content of a file
‒ Command: echo “[content]” | md5sum • Compute the MD5 hash of [content]
• SHA1
‒ Command: sha1sum [filename]
• SHA256
‒ Command: sha256sum [filename]
38
SUMMARY OF HASH • Four important properties • Applications
‒ Detecting file change ‒ Storing password ‒ Message authentication, HMAC ‒ …
• Hash Functions:
‒ MD5: 128 bits – not secure at all ‒ SHA1: 160 bits – not very secure ‒ SHA256, SHA384, SHA512 – secure for now ‒…
39
APPLICATION • Alice wants to prove to Bob that she is indeed Alice
‒ Alice and Bob shares a common secret key ‒ But the communication between them is on a public channel. ‒ What Alice should do to prove to Bob that she is indeed Alice without actually disclosing the key itself?
40
APPLICATION: USER AUTHENTICATION • Alice wants to authenticate herself to Bob ‒ assuming they already share a secret key K
• Protocol:
Alice
Bob picks random number R
computes Y=H(R|K) time
verifies that Y=H(R|K)
What should be the range of random number R?
41
CS 4173/5173 COMPUTER SECURITY High Level Introduction to Public Key Cryptography
COMPANY A’S PROBLEM I • Company A is a big web service company with over 10,000 employees. • The president Bob wants to make sure that all employees can verify the authenticity of the announcement emails that he sends. • Q: How to ensure authenticity of these emails.
43
COMPANY A’S PROBLEM II • Company A is accepting vulnerability report of their web system from the public. • They need a design that someone can successfully send the report of a potential vulnerability via email to them. • Q: How to ensure the confidentiality of reports?
44
HOW TO SECURELY SURF INTERNET
A D C B
E
Share the same key before access?
45
PUBLIC KEY CRYPTOGRAPHY plaintext
encryption Public key
ciphertext
different!
decryption
plaintext
Private key
• Invented and published in 1975 • A public / private key pair is used • Also known as asymmetric cryptography • Much slower to compute than secret key cryptography 46
PUBLIC/PRIVATE KEY • Public key – encrypt • Private key – decrypt
• How does the secret communication look like?
Alice
Bob
wants to send a message
47
PUBLIC/PRIVATE KEY • Alice has her own public and private key pair • Bob also has his own public and private key pair • Public key
‒ Can be released to the public
• Private Key
‒ Must be kept secret.
48
AUTHENTICATION IN PUBLIC KEY CRYPTO • Message integrity with digital signatures • Alice computes hash, signs with her private key (no one else can do this without her key) • Bob verifies hash on receipt using Alice’s public key using the verification equation
Plaintext
Alice Signs
Plaintext with digital signature
Alice’s Private Key
Bob Verifies Signature
Valid / Not Valid
Alice’s Public Key 49
AUTHENTICATION (CONT’D) • Authentication in public key crypto:
‒ Hash function to hash the message into a digest ‒ The action of sign the digest with (private key) ‒ The action of verify the digest with (public key)
50
PUBLIC-KEY REQUIREMENTS • It must be computationally
‒ easy to generate a public / private key pair ‒ hard to determine the private key, given the public key
• It must be computationally
‒ easy to encrypt using the public key ‒ easy to decrypt using the private key ‒ hard to recover the plaintext message from just the ciphertext and the public key
51
PUBLIC KEY ALGORITHMS • Public key algorithms covered in this class, and their applications System
Encryption / Decryption?
Digital Signatures?
Key Exchange?
RSA
Yes
Yes
Yes
DiffieHellman DSA
Yes Yes
52
SOLVING COMPANY A’S PROBLEM I • Company A is a big web service company with over 10,000 employees. • The president Bob wants to make sure that all employees can verify the authenticity of the announcement emails that he sends. • Answer:
‒ Everyone knows Bob’s public key. ‒ Bob signs the email using his private key. ‒ Everyone can verify the signed email using Bob’s public key
53
SOLVING COMPANY A’S PROBLEM II • Company A is accepting vulnerability report of their web system from the public. • They need a design that someone can successfully send the report of a potential vulnerability via email to them. • Answer:
‒ Company A generates a key pair, then releases the public key to the public for vulnerability report. ‒ Everyone uses the public key to encrypt the report.
54
PUBLIC KEY VS. SYMMETRIC KEY Symmetric key
Public key
Two parties MUST trust each other
Two parties DO NOT need to trust each other
Both share same key
Two separate keys: a public and a private key
Typically faster
Typically slower
Examples:
Examples:
DES, RC5, AES, …
RSA, DSA, ECC…
55
COMPANY A’S PROBLEM III • Company A provides an on-line chat service for vulnerability report. ‒ Requirement 1: confidentiality. ‒ Requirement 2: efficiency because there will be a number of message exchanges.
• Q: How to satisfy both requirements?
56
DIGITAL ENVELOPE: SYMMETRIC+ASYMMETRIC
- Generate a secret key (called a session key) at random.
- Encrypt the message using the session key and symmetric algorithm.
- Encrypt the session key with the recipient’s public key. This becomes the “digital envelope”.
- Send the encrypted message and the digital envelope to the recipient.
57
DIGITAL ENVELOPE (CONT’D) Alice (finds a vulnerability)
Bob (company representative)
Generate a session key Ks
Knows the public key: Kp
Encrypt message M using Ks Encrypted Ks using Kp
Send a response using Ks All remaining communication is based on Ks
Public key: Kp Private key: Ki
- Use Ki to decrypt Ks
- Use Ks to decrypt M
58