ShazamKit is a recent Apple framework, announced at WWDC 2021, that brings audio-matching capabilities to your app. You can make any prerecorded audio recognizable by building your own custom catalogue from podcasts, videos, and more, or match music against the millions of songs in Shazam's vast catalogue.
Today, we are going to build a simple music-matching recognizer. The idea is to design a component that is independent of the UI framework being used (SwiftUI or UIKit).
We will create a Swift class, creatively named `ShazamRecognizer`, that will perform a few simple tasks:

- Create the properties that will help us build our class
- Request permission to record audio using the `AVFoundation` framework
- Start recording and send the recording to `ShazamKit` for match recognition
- Handle the response from ShazamKit (success when a match is found, or an error when no match is found)
- Display our result in a UI (e.g. SwiftUI or UIKit)
## Create the properties that will help us build our class
```swift
import AVFoundation
import Combine
import ShazamKit

final class ShazamRecognizer: NSObject, ObservableObject {
    // 1. Audio engine used to capture audio from the microphone
    private let audioEngine = AVAudioEngine()
    // 2. Shazam session that performs the matching
    private let shazamSession = SHSession()
    // 3. UI state: whether we are currently listening
    @Published private(set) var isRecording = false
    // 4. Success case: the matched track, if any
    @Published private(set) var matchedTrack: ShazamTrack?
    // 5. Failure case: an error to surface in the UI
    @Published var error: ErrorAlert? = nil
}
```
In the above declarations:

- We create the `audioEngine`, which is used to start and stop the recording
- We create the `shazamSession`, which is used to perform the matching process
- We use `isRecording` to track whether or not there is an ongoing recording operation
- We create a variable of custom type `ShazamTrack` to store our result in case of success
- In case of failure, we store the error in the `error` variable of type `ErrorAlert` (a sketch of both helper types follows this list)
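`ShazamTrack` and `ErrorAlert` are small helper types that the article does not define. Here is a minimal sketch of what they might look like; the exact properties are assumptions, not the article's definitions:

```swift
import ShazamKit

// Assumed shape: a lightweight wrapper around the matched media item.
struct ShazamTrack: Identifiable {
    let id = UUID()
    let title: String
    let artist: String
    let artworkURL: URL?

    init(_ item: SHMatchedMediaItem) {
        self.title = item.title ?? "Unknown Title"
        self.artist = item.artist ?? "Unknown Artist"
        self.artworkURL = item.artworkURL
    }
}

// Assumed shape: an Identifiable error wrapper so it can drive a SwiftUI alert.
struct ErrorAlert: Identifiable {
    let id = UUID()
    let message: String

    init(_ message: String) {
        self.message = message
    }
}
```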
## Request permission to record audio using the AVFoundation framework
```swift
func listenToMusic() {
    // 1. Get the shared audio session
    let audioSession = AVAudioSession.sharedInstance()
    // 2. Ask the user for permission to record
    audioSession.requestRecordPermission { granted in
        if granted {
            // 3. Permission granted: start recording
            self.recordAudio()
        } else {
            // 4. Permission denied: surface an error to the UI.
            // The completion handler may run off the main thread,
            // so hop to main before touching a @Published property.
            DispatchQueue.main.async {
                self.error = ErrorAlert("Please allow microphone access!")
            }
        }
    }
}
```
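Note that on iOS your app's Info.plist must contain the `NSMicrophoneUsageDescription` key with a short usage string; without it, the system terminates the app as soon as it tries to access the microphone.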
## Start recording and send the recording to ShazamKit for recognition
```swift
private func recordAudio() {
    // 1. If the `audioEngine` is already running, stop it and return
    if audioEngine.isRunning {
        self.stopAudioRecording()
        return
    }

    // 2. Grab the input node (the microphone) and its output format
    let inputNode = audioEngine.inputNode
    let format = inputNode.outputFormat(forBus: .zero)

    // 3. Remove the tap if one is already installed
    inputNode.removeTap(onBus: .zero)

    // 4. Install an audio tap on the bus and stream each buffer to ShazamKit
    inputNode.installTap(onBus: .zero,
                         bufferSize: 1024,
                         format: format) { buffer, time in
        self.shazamSession.matchStreamingBuffer(buffer, at: time)
    }

    audioEngine.prepare()
    do {
        try audioEngine.start()
        DispatchQueue.main.async {
            self.isRecording = true
        }
    } catch {
        DispatchQueue.main.async {
            self.error = ErrorAlert(error.localizedDescription)
        }
    }
}

private func stopAudioRecording() {
    audioEngine.stop()
    DispatchQueue.main.async {
        self.isRecording = false
    }
}
```
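Depending on your app's audio setup, you may also need to configure the shared `AVAudioSession` for recording before starting the engine. A minimal sketch, where the helper name, category, and options are assumptions to adjust to your app's needs:

```swift
import AVFoundation

// Hypothetical helper: configure the shared audio session for recording.
// Call this before starting the audio engine.
private func configureAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
    // .playAndRecord lets the app capture microphone input
    try session.setCategory(.playAndRecord, mode: .default)
    try session.setActive(true, options: .notifyOthersOnDeactivation)
}
```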
## Handle the response from ShazamKit

`SHSession` reports its results through a delegate, so we first make our class the session's delegate in its initializer:

```swift
override init() {
    super.init()
    // Receive match results through SHSessionDelegate callbacks
    shazamSession.delegate = self
}
```
Then we conform to `SHSessionDelegate` to handle the two possible outcomes:

```swift
extension ShazamRecognizer: SHSessionDelegate {
    // Called when ShazamKit finds a match for the streamed audio
    func session(_ session: SHSession, didFind match: SHMatch) {
        DispatchQueue.main.async {
            if let firstItem = match.mediaItems.first {
                self.matchedTrack = ShazamTrack(firstItem)
                self.stopAudioRecording()
            }
        }
    }

    // Called when ShazamKit cannot match the audio (or an error occurred)
    func session(_ session: SHSession, didNotFindMatchFor signature: SHSignature, error: Error?) {
        DispatchQueue.main.async {
            self.error = ErrorAlert(error?.localizedDescription ?? "No match found!")
            self.stopAudioRecording()
        }
    }
}
```
At this point, we are pretty much done with our audio recognition system! 👏🏻💪🏼 We are ready to use it in our application, no matter the UI framework.
For this example, I've used SwiftUI for a quick prototype, but you can use UIKit as well without any particular effort.
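As an illustration, a minimal SwiftUI view driving the recognizer could look like the sketch below (this view is illustrative, not the demo project's exact UI):

```swift
import SwiftUI

struct ContentView: View {
    // Own the recognizer so its @Published properties drive the view
    @StateObject private var recognizer = ShazamRecognizer()

    var body: some View {
        VStack(spacing: 16) {
            // Show the matched track once recognition succeeds
            if let track = recognizer.matchedTrack {
                Text(track.title).font(.headline)
                Text(track.artist).font(.subheadline)
            }
            Button(recognizer.isRecording ? "Listening…" : "Recognize Music") {
                recognizer.listenToMusic()
            }
        }
        .padding()
        // Surface failures through an alert bound to the error property
        .alert(item: $recognizer.error) { error in
            Alert(title: Text(error.message))
        }
    }
}
```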
You can find the full demo project
## Conclusion
The `ShazamKit` framework has a lot to offer, but in this article we have only scratched the surface. I hope you have learned something today :)