Unit 3.9-3.11 Developing Algorithms and Binary Search Notes
- Unit 3.9-3.11 Developing Algorithms Java
- Review
- Conditionals vs Booleans
- Challenge
- Unit 3.9-3.11 Developing Algorithms Python
- Unit 3.9-3.11 Searching Introduction
sunny = true;
rainy = false;
if (sunny) {
umbrella = false;
} else if (rainy) {
umbrella = true;
} else {
umbrella = false;
}
console.log(umbrella);
The code above is the same as below:
umbrella = !sunny && rainy;
console.log(umbrella);
To determine if two conditionals and booleans are the same, you can substitute the four possibilities that the two booleans (sunny and rainy) can be (listed below) into the conditional and boolean and see if both cases match:
sunny = true
, rainy = true
sunny = true
, rainy = false
sunny = false
, rainy = true
sunny = false
, rainy = false
Challenge
Using JavaScript, create an algorithm that takes in an IP address and a subnet mask and computes the network address.
Overview
As we've seen in Unit 4.1, an IP address is a 32 bit number that uniquely identifies each device. (See this for a recap). Something extra is that an IP address also comes with a subnet mask. A subnet mask is also a 32 bit number that identifies what network an IP address in in through a process that uses the bitwise AND.
In ANDing:
0 + 0 = 0
0 + 1 = 0
1 + 0 = 0
1 + 1 = 1
The following are the steps to determine the network that an IP address is in given the subnet mask:
Example: IP address: 192.168.0.1
Subnet mask: 255.255.255.0
- Convert the IP address into binary: 192.168.0.1 -> 11000000.10101000.00000000.00000001
- Convert the subnet mask into binary: 255.255.255.0 -> 11111111.11111111.11111111.00000000
- Do a bitwise AND operation on the binary IP address and subnet mask:
11000000.10101000.00000000.00000001
+11111111.11111111.11111111.00000000
=11000000.10101000.00000000.00000000
- Convert the result back to decimal: 11000000.10101000.00000000.00000000 -> 192.168.0.0
The network address is 192.168.0.0
Algorithms can be written in different ways and still accomplish the same tasks. Algorithms that look similar often yield differnet outputs. To solve the same problem, many different algorithms can be used.
Therefore, algorithms are very important for programmers, and today we're going to explore how to determine the outcome of algorithms, how to deteremine the output of similar algorithms, how to edit existing algorithms, and how to develop our own algorithms.
Consider the following algorithm.
def mystery(num, num2):
if (num % num2 == 0):
print("True")
else:
print("False")
mystery(20, 3)
- What does the algorithm do? Please explain in words. The algorithm would produce a boolean output true if the mystery numbers can divide into a whole number and false if it can't.
- What if I put in 30 as num and 4 as num2. What would be the output? The output would be false since 30 is not divisible by 4.
What is the output of this algorithm? If the variable "temp" is greater than 90 (degrees), it would print "it is too hot outside" and since the temp is = to 95, that's what it will print. If it is not greater then 90, it will go into another if else loop that determines what to print whether or not temp is >= 65 or not.
temp = 95
if (temp >= 90):
print("it is too hot outside")
else:
if (temp >= 65):
print("I will go outside")
else:
print("it is too cold outside")
What is the output of this algorithm? it looks similar but the output is different! Instead of just printing the output if temp is > 90, it will also run the ifs that aren't in an else loop which would therefore print a different output.
temp = 95
if (temp >= 90):
print("it is too hot outside")
if (temp >= 65):
print("i will go outside")
if (temp < 65):
print("it is too cold outside")
To develop algorithms, we first need to understand what the question is asking. Then, think about how you would approach it as a human and then try to find what pattern you went through to arrive at the answer. Apply this to code, and there you have it! An algorithm!
Let's say you wanted to sum up the first five integers. How would you do this in real life? Your thought process would probably be:
- The sum of the first integer is 1.
- Then, increase that integer by 1. I now add 2 to my existing sum (1). My new sum is 3.
- Repeat until I add 5 to my sum. The resulting sum is 15.
Now let's translate this into code.
sum = 0 # start with a sum of 0
for i in range (1, 6): # you will repeat the process five times for integers 1-5
sum = sum + i # add the number to your sum
print(sum) # print the result
Task: Write an algorithm in python that sums up the first 5 odd integers. You can use the following code as a starter.
sum = 0
counter = 1
for i in range (0, 5):
sum = sum + counter
counter = counter + 2
print(sum)
Homework
Create an algorithm that will start with any positive integer n and display the full sequence of numbers that result from the Collatz Conjecture. The COllatz Conjecture is as follows:
- start with any positive integer
- if the number is even, divide by 2
- if the number is odd, multiply by 3 and add 1
- repeat steps 2 and 3 until you reach 1
Example: if the starting number was 6, the output would be 6, 3, 10, 5, 16, 8, 4, 2, 1
What is searching?
In certain computer programs and applications, one might find the need to locate and retrieve a data value and/or it's index. Searching algorithms could be done in either intervals or sequences, and certain algorithms could be more efficient than others, with benefits and drawbacks to each.
The Naive Approach
The most intuitively obvious solution to the searching problem is to sequentialy check each successful value in the data structure until either a matching value is found, or the entire structure has been transversed. This thought process could be explained graphically in this example
def sequentialSearch(arr, target):
N = len(arr) # Declare N as length of array
for i in range(N): # Iterate over the list
if arr[i] == target: # Check for match
return i # Match found, return index and end function call
return -1 # Element not found
Sequential Search - Larger Inputs
Although for selection sort is seemingly fast for smaller inputs, it is clear that it cannot keep up with increasing input sizes. Because sequential search checks every value of the given array, the algorithm's overall runtime increases "linearly" with the input size.
i.e. Pretend that one check takes one second, and that we are searching for the last element in an array. If the array length is 4, it would take 4 seconds to find the last element, whereas if the array length is 86400 indices long, it would take a whole day to find the element.
Hence, although selection sort is known for its simplicity, it is unfeasible for large inputs
Below, we have created three sorted lists of length 100,1000000,100000000.
import time
arr1 = [i for i in range(100)]
arr2 = [i for i in range(1000000)]
arr3 = [i for i in range(100000000)]
To analyze the sequential search algorithm, we will check for the worst case scenario, where runtime is maximized. This is because when measuring the efficiency of our algorithm, we want to be able to guarantee an upper limit or set amount of time for our program to finish running. To do this, we will attempt to search for the last element in the array
print("length of list: ", len(arr1))
s = time.time()
print("Index: ", sequentialSearch(arr1,99))
e = time.time()
print('Execution time:', (e-s)*1000, 'ms')
print("length of list: ", len(arr2))
s = time.time()
print("Index: ", sequentialSearch(arr2,999999))
e = time.time()
print('Execution time:', (e-s)*1000, 'ms')
print("length of list: ", len(arr3))
s = time.time()
print("Index: ", sequentialSearch(arr3,99999999))
e = time.time()
print('Execution time:', (e-s)*1000, 'ms')
As you can see, as the input list grows larger and larger, the overall runtime of the program increases linearly as well, resulting in a lower scalability for the sequential search algorithm.
Binary Search
Binary search is an efficient way to iterate through a SORTED list to find a requested value. This is done through checking the middle value of a list and checking if the requested value is greater than or less than the middle value. You can start to see why the requested list must be sorted. If the list is not sorted, this logic is flawed, and the binary search algorithm will no longer work.
Unlike the sequential search method, binary search doesn't check for each successive element until a match is found. In every iteration the algorithm is making a binary decision; if the selected element is larger or smaller than the target.
How exactly does this work? Lets look at these amazing ms paint drawings:
This algorithm is extremely efficient as the maximum number of cycles in binary search is equal to log base 2 of the closest, next power of two, to length of list.
If the array is 8 items long, the maximum possible cycles would be 3 (log base 2 of 8 is 3)
If the array is 7 items long, the maximum possible cycles would STILL be 3 as the closest power of 2 to 7 is 8.
If the array is 9 items long, the maximum possible cycles INCREASES to 4, as the closest, next power of two, is 16.
def binarySearch(array, target): # Parameters: array is the given array and target is what we are looking for
low = 0 # the starting lower bound
high = len(array)-1 # the starting upper bound
while high >= low: # we will keep running until we run out of possible subarrays...
mid = (high + low) // 2 # define the middle of the list to be the item at the index of the average of the lower and upper bound
if array[mid] == target: # if item is in the middle of the list... we found what we are looking for!
return mid # therefore, we return the index of where we found the item.
elif array[mid] > target: # if item is less than the middle of the list, this must mean that the item is on the lower half of the list
high = mid-1 # therefore, we set the upper bound of the search to be the last item of the lower half
else: # if item is neither less than or equal to the middle of the list, this must mean that the item is on the upper half of the list
low = mid+1 # therefore, we set the lower bound of the search to be the first item of the upper half
# if nothing is returned by the time the while loop ends, that means item MUST be missing from list
return False # therefore we tell the user that the requested item was not found
Likewise, we can also take a recursive approach to this problem, note the similarities
def BinarySearchRecursion(arr, target, lo, hi):
if lo > hi:
return False
mid = (lo+hi)//2
if arr[mid] == target:
return mid
elif arr[mid] > target:
return BinarySearchRecursion(arr, target, lo, mid-1)
elif arr[mid] < target:
return BinarySearchRecursion(arr, target, mid+1, hi)
Now, let's compare the runtime with the same lists we tried with the sequential search algorithm.
print("length of list: ", len(arr1))
s = time.time()
print("Index: ", binarySearch(arr1,99))
e = time.time()
print('Execution time:', (e-s)*1000, 'ms')
print("length of list: ", len(arr2))
s = time.time()
print("Index: ", binarySearch(arr2,999999))
e = time.time()
print('Execution time:', (e-s)*1000, 'ms')
print("length of list: ", len(arr3))
s = time.time()
print("Index: ", binarySearch(arr3,99999999))
e = time.time()
print('Execution time:', (e-s)*1000, 'ms')
arr4 = [1,4,5,8,10,13,145,1938]
print("Index: ", binarySearch(arr4,17))
Challenges and Homework
You have one homework problem.
Yes just one.
Don't get excited though.
Problem: Given a specific integer N, return the square root of N (R) if N is a perfect square, otherwise, return the square root of N rounded down to the nearest integer
Input: N (Integer)
Output: R (Integer)
Constraints: Do not use any built-in math operations such as sqrt(x)
or x**(0.5)
, Try complete the problem in logarithmic time.
Hint 1: Maybe you can use Binary Search to try and reduce the number of checks you have to perform?
Hint 2: Is there a mathematical pattern amongst numbers and their square roots that could help you reduce the number of searches or iterations you must execute? Is there some value or rule you can set before applying binary search to narrow the range of possible values?
Run the very last code segment below to load test cases and submission function
from math import sqrt as sq
def sqrt(N: int) -> int:
low = 0
high = N
while low <= high:
mid = (low + high) // 2
if mid * mid <= N:
low = mid + 1
else:
high = mid - 1
return low - 1
Explaining code: The sqrt() function defined in this code uses a binary search algorithm to calculate the square root of a number N. It does this by first initializing two variables, low and high, to 0 and N. It then enters a loop that continues until low is greater than high. In each iteration of the loop, the mid variable is calculated as the average of low and high, rounded down to the nearest integer using integer division. The if statement in the loop then checks if the square of mid is less than or equal to N. If this is the case, low is set to mid + 1, and the loop continues. Otherwise, high is set to mid - 1, and the loop continues. When the loop ends, the function returns low - 1 as the square root of N
from math import sqrt as sq
test_cases = [0,1,4,85248289,22297284,18939904,91107025,69122596,9721924,37810201,1893294144,8722812816,644398225]
answers = [int(sq(x)) for x in test_cases]
def checkValid():
for i in range(len(test_cases)):
if sqrt(test_cases[i]) == answers[i]:
print("Check number {} passed".format(i+1))
else:
print("Check number {} failed".format(i+1))
checkValid()