Introduction
Iconik lets users integrate with Artificial Intelligence/Machine Learning (AI/ML) services to enrich their media archive. Iconik comes with our own Facial Recognition service, and we support integration with third-party services for transcription, facial recognition, object detection, and summarization and topic extraction. All use of AI/ML services is optional, and customer administrators of the Iconik domain can choose to allow access to these features on a per-user basis.
Iconik comes with some services configured by default, but customers can choose to disable these and use their own accounts with the supported AI services.
Face Recognition
Iconik comes with a built-in Face Recognition service for videos and images, which allows customers to build a catalog of relevant persons in their own content. Our Face Recognition service is split into two steps: detection and recognition. The detection step uses a generic model, which is not trained on our customers' content, to detect the presence and location of faces in images and videos. The recognition step takes the data from the detection step and compares it to other faces within the same Iconik domain to determine whether the face belongs to an already known person or a new one. This step keeps each customer's content separate, and no data is shared between different customers.
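The domain-scoped comparison described above can be illustrated with a minimal sketch. This is not Iconik's implementation: the `DomainFaceCatalog` class, the cosine-similarity matching, the 0.8 threshold, and the identifier format are all assumptions chosen for illustration. The point it shows is that a new face is only ever compared with faces stored under the same domain.

```python
import math
from collections import defaultdict


class DomainFaceCatalog:
    """Hypothetical per-domain store of face embeddings (illustration only)."""

    def __init__(self, threshold=0.8):
        # Assumed similarity cut-off; a real system would tune this value.
        self.threshold = threshold
        # domain_id -> list of (person_id, embedding)
        self._faces = defaultdict(list)
        # domain_id -> number of persons created so far
        self._person_count = defaultdict(int)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def recognize(self, domain_id, embedding):
        """Return the person ID for a detected face, creating one if needed."""
        best_id, best_score = None, self.threshold
        # Only faces stored for this domain are ever consulted.
        for person_id, known in self._faces[domain_id]:
            score = self._cosine(embedding, known)
            if score >= best_score:
                best_id, best_score = person_id, score
        if best_id is None:
            # No sufficiently similar face in this domain: treat as a new person.
            self._person_count[domain_id] += 1
            best_id = f"{domain_id}/person-{self._person_count[domain_id]}"
        self._faces[domain_id].append((best_id, embedding))
        return best_id
```

In this sketch, a face submitted under one domain can never match a face stored under another, because the lookup is keyed by the domain identifier before any comparison takes place.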
Transcription
Transcription services use the audio track of a video or audio file to create a textual transcript. Out of the box, Iconik uses Rev AI as our transcription engine, but Amazon Rekognition can also be used.
When a user sends a video file to the transcription service, we extract the audio track and pass only the audio to the service. The transcription service temporarily stores the audio file so that it can run the analysis. Once the transcription job has completed, the file is deleted from the transcription service.
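As an illustration of passing only the audio onward, the following sketch builds an FFmpeg command that drops the video stream and writes a mono WAV file. The tooling, function names, and the 16 kHz sample rate are assumptions made for this example, not a description of what Iconik runs internally.

```python
import subprocess
from pathlib import Path


def build_audio_extract_command(video_path, audio_path, sample_rate=16000):
    """Build an ffmpeg command that keeps only the audio of a media file."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video_path),    # input media file
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single audio channel
        "-ar", str(sample_rate),  # resample to the assumed sample rate
        str(audio_path),
    ]


def extract_audio(video_path, work_dir):
    """Run ffmpeg (assumed to be installed) and return the audio file path."""
    audio_path = Path(work_dir) / (Path(video_path).stem + ".wav")
    subprocess.run(build_audio_extract_command(video_path, audio_path), check=True)
    return audio_path
```

Only the file returned by `extract_audio` would then be handed to a transcription service, so the video frames never leave the system in this sketch.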
When using the Rev AI transcription engine, it is also possible to configure Summarization and Topic Extraction. When this functionality is used, the output of the transcription engine is used to create a short summary and a list of topics covered in the media file.
The resulting transcription output is stored in Iconik's database, associated with the media asset, and is deleted either when a user manually deletes the transcription track or when the asset itself is deleted.
Analysis
Iconik supports the Amazon Rekognition and Google Vision and Video AI services for Facial Recognition and Object Detection. As with transcription, customers can choose whether to use the Iconik-managed services or bring their own accounts.
For image and video analysis, we send the proxy file to the analysis service. This proxy file is smaller than the original file but contains equivalent information and is therefore a suitable substitute in the analysis process. The analysis service keeps a copy of the file for the duration of the analysis, and the file and associated metadata are deleted once the analysis job is completed.
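The "keep a copy only for the duration of the analysis" pattern described above can be sketched with a context manager. This is a generic illustration, not the code of any analysis service: `temporary_copy`, `analyze`, and the `run_analysis` callback are hypothetical names.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_copy(source_path):
    """Yield a working copy of a file that is removed when the block exits."""
    work_dir = Path(tempfile.mkdtemp(prefix="analysis-"))
    try:
        copy_path = work_dir / Path(source_path).name
        shutil.copy2(source_path, copy_path)
        yield copy_path
    finally:
        # Runs on success and on failure, so the copy never outlives the job.
        shutil.rmtree(work_dir, ignore_errors=True)


def analyze(proxy_path, run_analysis):
    """Run a caller-supplied analysis function against a short-lived copy."""
    with temporary_copy(proxy_path) as working_copy:
        return run_analysis(working_copy)
```

Placing the cleanup in a `finally` clause means the working copy is removed even if the analysis raises an error, which matches the intent that nothing is retained after the job ends.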
Training
Iconik and the services we use do not use any customer content to train AI models. The Face Recognition feature does compare new faces with previously detected faces, but only within the same Iconik domain. No data is shared between customers.