Let's get started with the tflite iOS tutorial in earnest.

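This post is based on the official tutorial code and its explanations: tflite-ios tutorial github link
Up through part 2, I plan to follow the official tutorial to get a feel for things.
To follow along smoothly, it is a good idea to clone that repo before starting.
I will explain tflite itself later; for now, working through the official tutorial is a good way to get a feel for it.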

Requirements

Device with iOS 12.0 or above
Xcode 10.0 or above
Valid Apple Developer ID
Xcode command-line tools (run xcode-select --install)
CocoaPods (run sudo gem install cocoapods)
The demo app requires an iOS device, because it uses the device's camera.
It can still be built for the iPhone simulator, but a "Camera not found" error will appear.

Build and run

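Now let's actually build the tflite tutorial. Clone the GitHub repo linked above and go into lite/examples/image_classification/ios; you should see a directory laid out as in the figure below. In the previous post I explained CocoaPods, a dependency management tool for iOS, and this is why: the libraries this tutorial uses are managed through CocoaPods, so they must be installed with pod before the project is built. If you clone the repo, open the project, and simply try to build, you will run into a build failure like the one in the figure.

Figure 1. lite/examples/image_classification/ios: what happens if you build right away?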

Install the library

Before building, install the libraries the tutorial uses with CocoaPods.
Conveniently, a Podfile already exists, so a single pod command is all we need.
Go into the project directory and run pod install, as in the commands below.
Then open ImageClassification.xcworkspace (not ImageClassification.xcodeproj) and run it [2].

cd /your tflite tutorial path/
pod install

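Figure 2. Running pod install in the lite/examples/image_classification/ios directory

Now the project builds and runs.
The figure below shows a build on the iPhone simulator.
The app therefore does not actually work there, since no camera is available.
Use it only to confirm that the build succeeds.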

If it does not work, check the following.

The build reports that the TensorFlowLite module cannot be found [2]
- Check whether you opened the project through ImageClassification.xcodeproj instead of ImageClassification.xcworkspace.
- I ran into this error at first and was thrown by it; [2] helped me resolve it.
- When a project uses external libraries installed through CocoaPods, open it through the xcworkspace, not the xcodeproj.
  Figure 4. No TensorFlowLite...
Errors caused by the development team or bundle identifier
- Selecting ImageClassification brings up the project settings tabs, as in the figure below.
  - Signing & Capabilities → choose a Team
  - Create a unique bundle identifier.

Explore the code

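Now that the libraries are installed and the build is confirmed, let's look at the code. Before digging into how things work, compare the structure of an empty project where only pod install was run (only the libraries installed) with the project we just built.

A lot is clearly already implemented. Building all of this from scratch would be daunting, so for now let's identify the parts that matter and check what each one does.

Left: the newly created empty project.
Right: the project we built.
Following https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/ios/EXPLORE_THE_CODE.md , we will look at the parts that document covers.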

Get camera input

The parts we will focus on are Camera Feed and ModelDataHandler.
The demo app's main view is the ViewController class in ViewController.swift under ViewControllers.
As the main view, ViewController handles camera frames by conforming to the CameraFeedManagerDelegate protocol.
Per-frame model inference is implemented in the didOutput method.
That is, didOutput receives a frame and calls the functions needed to obtain the model inference output.

// MARK: CameraFeedManagerDelegate Methods
extension ViewController: CameraFeedManagerDelegate {

  func didOutput(pixelBuffer: CVPixelBuffer) {
    let currentTimeMs = Date().timeIntervalSince1970 * 1000
    guard (currentTimeMs - previousInferenceTimeMs) >= delayBetweenInferencesMs else { return }
    previousInferenceTimeMs = currentTimeMs

    // Pass the pixel buffer to TensorFlow Lite to perform inference.
    result = modelDataHandler?.runModel(onFrame: pixelBuffer)

    // Display results by handing off to the InferenceViewController.
    DispatchQueue.main.async {
      let resolution = CGSize(width: CVPixelBufferGetWidth(pixelBuffer), height: CVPixelBufferGetHeight(pixelBuffer))
      self.inferenceViewController?.inferenceResult = self.result
      self.inferenceViewController?.resolution = resolution
      self.inferenceViewController?.tableView.reloadData()
    }
  }
}

pixelBuffer can be thought of as the frame buffer: this is where frames arrive from the iOS device's camera hardware.
modelDataHandler is responsible for handling the classification model.
The frame obtained from the iOS device is passed to the classification model.
The TensorFlow Lite Interpreter instance then performs image classification.
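
The guard at the top of didOutput throttles inference so that not every camera frame reaches the model. The sketch below isolates that idea in plain Swift. InferenceThrottle and the 1000 ms interval are hypothetical names and values chosen for illustration; they are not part of the demo app.

```swift
import Foundation

/// Hypothetical helper (not in the demo app) that mirrors the time-based guard in `didOutput`.
struct InferenceThrottle {
  let minimumIntervalMs: Double
  private(set) var previousMs: Double = -Double.infinity

  init(minimumIntervalMs: Double) {
    self.minimumIntervalMs = minimumIntervalMs
  }

  /// Returns true (and records the timestamp) only if enough time has
  /// passed since the last accepted frame.
  mutating func shouldRun(atMs nowMs: Double) -> Bool {
    guard nowMs - previousMs >= minimumIntervalMs else { return false }
    previousMs = nowMs
    return true
  }
}

// Simulated frame timestamps in milliseconds (assumed values).
var throttle = InferenceThrottle(minimumIntervalMs: 1000)
let frameTimesMs: [Double] = [0, 400, 999, 1000, 1500, 2100]
let accepted = frameTimesMs.filter { throttle.shouldRun(atMs: $0) }
print(accepted)  // [0.0, 1000.0, 2100.0]
```

Frames that arrive too soon are simply dropped, which keeps the camera callback responsive while the model runs at a bounded rate.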

ModelDataHandler

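This is the part that actually handles the deep learning model.
It covers data preprocessing, inference, and postprocessing.
- Inference creates a TensorFlow Lite Interpreter object; on a successful run it yields the top N classification results.
In general, TensorFlow Lite inference proceeds through the following steps, so let's match them against the code.
Model load
- Loads the .tflite model, including its execution graph, into memory.
- init takes the model file's name and extension, resolves its path, and passes that path to the interpreter. In other words, init loads the model into memory.
Data transformation
- Converts the data into the format the model requires.
Model inference
Interpreting the results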

Initialization

Create the Interpreter object.
- interpreter = try Interpreter(modelPath: modelPath, options: options)
- Inference with a TFLite model must go through the TFLite Interpreter object.
- TFLite interpreter (Python API reference): https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter
Load the classification model and the label file from the main bundle.

// MARK: - Initialization

  /// A failable initializer for `ModelDataHandler`. A new instance is created if the model and
  /// labels files are successfully loaded from the app's main bundle. Default `threadCount` is 1.
  init?(modelFileInfo: FileInfo, labelsFileInfo: FileInfo, threadCount: Int = 1) {
    let modelFilename = modelFileInfo.name

    // Construct the path to the model file.
    guard let modelPath = Bundle.main.path(
      forResource: modelFilename,
      ofType: modelFileInfo.extension
    ) else {
      print("Failed to load the model file with name: \(modelFilename).")
      return nil
    }

    // Specify the options for the `Interpreter`.
    self.threadCount = threadCount
    var options = InterpreterOptions()
    options.threadCount = threadCount
    do {
      // Create the `Interpreter`.
      interpreter = try Interpreter(modelPath: modelPath, options: options)
      // Allocate memory for the model's input `Tensor`s.
      try interpreter.allocateTensors()
    } catch let error {
      print("Failed to create the interpreter with error: \(error.localizedDescription)")
      return nil
    }
    // Load the classes listed in the labels file.
    loadLabels(fileInfo: labelsFileInfo)
  }
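
The initializer ends by calling loadLabels, whose body is not shown in this post. As a rough sketch of what turning a labels file into an array can look like, assuming one label per line: parseLabels and the sample text below are hypothetical, and the demo app's own implementation may differ.

```swift
import Foundation

/// Hypothetical stand-in for the label-loading step: splits the file
/// contents into one label per line and drops blank lines.
func parseLabels(from contents: String) -> [String] {
  return contents
    .components(separatedBy: .newlines)
    .map { $0.trimmingCharacters(in: .whitespaces) }
    .filter { !$0.isEmpty }
}

// Assumed sample contents; a real labels file ships in the app bundle.
let sample = "background\ntabby cat\n\ngolden retriever\n"
let labels = parseLabels(from: sample)
print(labels)  // ["background", "tabby cat", "golden retriever"]
```

The index of each label in the array is what lets an output score at position i be mapped back to a class name.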

Process input

The model runs on the CVPixelBuffer of the camera output, received from the didOutput method in ViewController.swift under ViewControllers.
The input is reshaped to match the input dimensions of the deep learning model.
- A deep learning model generally has a fixed input shape (dimensions).

The image (camera output, CVPixelBuffer) is resized to the size the model requires.
- thumbnailPixelBuffer is the image resized to fit the model input.
- The image is cropped to a square around its center and then resized to the model dimensions.

/**
   Returns thumbnail by cropping pixel buffer to biggest square and scaling the cropped image to
   model dimensions.
   */
  func centerThumbnail(ofSize size: CGSize ) -> CVPixelBuffer? {
		...
		let thumbnailSize = min(imageWidth, imageHeight) // Side length of the centered square crop.

		...
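
The elided body computes a centered square before scaling. The arithmetic behind that crop can be sketched without Core Video as follows. CropRect and centerSquare are hypothetical names used only for this illustration, and the 1920x1080 frame size is an assumed example.

```swift
/// Hypothetical value type for this illustration only.
struct CropRect: Equatable {
  let x: Int
  let y: Int
  let side: Int
}

/// Returns the largest square centered in an image of the given size.
func centerSquare(imageWidth: Int, imageHeight: Int) -> CropRect {
  let side = min(imageWidth, imageHeight)
  return CropRect(
    x: (imageWidth - side) / 2,   // horizontal offset of the crop
    y: (imageHeight - side) / 2,  // vertical offset of the crop
    side: side
  )
}

let crop = centerSquare(imageWidth: 1920, imageHeight: 1080)
print(crop.x, crop.y, crop.side)  // 420 0 1080
```

Cropping to a square first means the later resize to the model dimensions does not distort the aspect ratio.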

Match the image channels: BGRA → RGB format
- Returns: The RGB data representation of the image buffer or nil if the buffer could not be converted.

/// Returns the RGB data representation of the given image buffer with the specified `byteCount`.
  ///
  /// - Parameters
  ///   - buffer: The pixel buffer to convert to RGB data.
  ///   - byteCount: The expected byte count for the RGB data calculated using the values that the
  ///       model was trained on: `batchSize * imageWidth * imageHeight * componentsCount`.
  ///   - isModelQuantized: Whether the model is quantized (i.e. fixed point values rather than
  ///       floating point values).
  /// - Returns: The RGB data representation of the image buffer or `nil` if the buffer could not be
  ///     converted.
  private func rgbDataFromBuffer(
    _ buffer: CVPixelBuffer,
    byteCount: Int,
    isModelQuantized: Bool
  ) -> Data?
{
...

}
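
Since the body of rgbDataFromBuffer is elided above, here is a minimal sketch of the channel conversion it describes, working on a plain byte array instead of a CVPixelBuffer. rgbBytes(fromBGRA:) and normalized(_:) are hypothetical helpers, and dividing by 255 for a float model is an assumption about what the model expects.

```swift
/// Converts tightly packed BGRA bytes to RGB bytes by reordering the
/// color channels and dropping the alpha channel.
func rgbBytes(fromBGRA bgra: [UInt8]) -> [UInt8] {
  precondition(bgra.count % 4 == 0, "Expected 4 bytes per pixel")
  var rgb = [UInt8]()
  rgb.reserveCapacity(bgra.count / 4 * 3)
  for i in stride(from: 0, to: bgra.count, by: 4) {
    rgb.append(bgra[i + 2])  // R
    rgb.append(bgra[i + 1])  // G
    rgb.append(bgra[i])      // B
  }
  return rgb
}

/// For a non-quantized model: scale bytes to 0.0...1.0 (assumed normalization).
func normalized(_ rgb: [UInt8]) -> [Float] {
  return rgb.map { Float($0) / 255.0 }
}

// Two BGRA pixels: (B: 10, G: 20, R: 30, A: 255) and (B: 0, G: 128, R: 255, A: 255).
let pixels: [UInt8] = [10, 20, 30, 255, 0, 128, 255, 255]
let rgb = rgbBytes(fromBGRA: pixels)
print(rgb)  // [30, 20, 10, 255, 128, 0]
```

A quantized model would take the UInt8 bytes directly, while a float model would take the output of normalized(_:), which is why the original function receives isModelQuantized.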

The full code is as follows.

func runModel(onFrame pixelBuffer: CVPixelBuffer) -> Result? {

	let sourcePixelFormat = CVPixelBufferGetPixelFormatType(pixelBuffer)
	    assert(sourcePixelFormat == kCVPixelFormatType_32ARGB ||
	             sourcePixelFormat == kCVPixelFormatType_32BGRA ||
	               sourcePixelFormat == kCVPixelFormatType_32RGBA)
	
	
	    let imageChannels = 4
	    assert(imageChannels >= inputChannels)
	
	    // Crops the image to the biggest square in the center and scales it down to model dimensions.
	    let scaledSize = CGSize(width: inputWidth, height: inputHeight)
	    guard let thumbnailPixelBuffer = pixelBuffer.centerThumbnail(ofSize: scaledSize) else {
	      return nil
	    }
	
	    let interval: TimeInterval
	    let outputTensor: Tensor
	    do {
	      let inputTensor = try interpreter.input(at: 0)
	
	      // Remove the alpha component from the image buffer to get the RGB data.
	      guard let rgbData = rgbDataFromBuffer(
	        thumbnailPixelBuffer,
	        byteCount: batchSize * inputWidth * inputHeight * inputChannels,
	        isModelQuantized: inputTensor.dataType == .uInt8
	      ) else {
	        print("Failed to convert the image buffer to RGB data.")
	        return nil
	      }
	      ...
	    } catch let error {
	      ...
	    }
	    ...
}

Run inference

With the model loaded into memory in init and the input prepared, it is time to run model inference.
All that remains is to pass the data to the interpreter.

func runModel(onFrame pixelBuffer: CVPixelBuffer) -> Result? {
...
	do {
	      ...
	
	      // Copy the RGB data to the input `Tensor`.
	      try interpreter.copy(rgbData, toInputAt: 0)
	
	      // Run inference by invoking the `Interpreter`.
	      let startDate = Date()
	      try interpreter.invoke()
	      interval = Date().timeIntervalSince(startDate) * 1000
	
	      // Get the output `Tensor` to process the inference results.
	      outputTensor = try interpreter.output(at: 0)
	    } catch let error {
	      print("Failed to invoke the interpreter with error: \(error.localizedDescription)")
	      return nil
	    }
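
The excerpt measures inference time by taking a Date before invoke() and converting the elapsed interval to milliseconds. The same pattern can be wrapped in a small helper, sketched here. measureMs is a hypothetical name, and the summation stands in for interpreter.invoke().

```swift
import Foundation

/// Runs `work` and returns its result together with the elapsed
/// wall-clock time in milliseconds.
func measureMs<T>(_ work: () throws -> T) rethrows -> (result: T, elapsedMs: Double) {
  let start = Date()
  let result = try work()
  let elapsedMs = Date().timeIntervalSince(start) * 1000
  return (result, elapsedMs)
}

// Placeholder workload standing in for the model invocation.
let timed = measureMs { (1...100_000).reduce(0, +) }
print(timed.result, timed.elapsedMs)
```

Date-based timing is wall-clock time, so it is good enough for displaying an approximate inference time but not for precise benchmarking.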

Process results

This step interprets the inference result.
The output data type depends on whether the model is quantized.
- UInt8 or Float32
- For UInt8, the values are converted back to floats (0.0 - 1.0), i.e. probabilities.

func runModel(onFrame pixelBuffer: CVPixelBuffer) -> Result? {
		...

		let results: [Float]
    switch outputTensor.dataType {
    case .uInt8:
      guard let quantization = outputTensor.quantizationParameters else {
        print("No results returned because the quantization values for the output tensor are nil.")
        return nil
      }
      let quantizedResults = [UInt8](outputTensor.data)
      results = quantizedResults.map {
        quantization.scale * Float(Int($0) - quantization.zeroPoint)
      }
    case .float32:
      results = [Float32](unsafeData: outputTensor.data) ?? []
    default:
      print("Output tensor data type \(outputTensor.dataType) is unsupported for this example app.")
      return nil
    }

    // Process the results.
    let topNInferences = getTopN(results: results)

    // Return the inference time and inference results.
    return Result(inferenceTime: interval, inferences: topNInferences)
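
The two postprocessing steps above, dequantization and top-N selection, can be tried in isolation. The sketch below reuses the formula from the excerpt, scale * (value - zeroPoint). The body of getTopN is not shown in this post, so topN here is only one plausible way to do it. Inference, the labels, and the quantization parameters are all hypothetical.

```swift
/// Hypothetical result type for this illustration.
struct Inference: Equatable {
  let label: String
  let confidence: Float
}

/// Maps quantized UInt8 outputs back to floats: scale * (value - zeroPoint).
func dequantize(_ values: [UInt8], scale: Float, zeroPoint: Int) -> [Float] {
  return values.map { scale * Float(Int($0) - zeroPoint) }
}

/// Pairs each score with its label, sorts by descending score,
/// and keeps the first `count` entries.
func topN(_ scores: [Float], labels: [String], count: Int) -> [Inference] {
  let ranked = zip(labels, scores)
    .sorted { $0.1 > $1.1 }
    .prefix(count)
  return ranked.map { Inference(label: $0.0, confidence: $0.1) }
}

// Assumed quantization parameters and labels.
let scores = dequantize([0, 64, 192], scale: 1.0 / 256.0, zeroPoint: 0)
let best = topN(scores, labels: ["cat", "dog", "bird"], count: 2)
for item in best {
  print(item.label, item.confidence)
}
// bird 0.75
// dog 0.25
```

After dequantization both model types yield a [Float], so the top-N step does not need to know whether the model was quantized.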

Reference
