Dependencies and their purpose
To use dependency injection well, it helps to understand how it works underneath, so we will start with the basics. This will be a two-part explanation.
This article describes what dependency injection is and how it benefits us. So let’s get started…
How do dependency injectors work?
To create and inject dependencies, most dependency injectors use reflection. Reflection is powerful, but it is comparatively slow. It also means that dependencies are resolved at runtime, so a missing or misconfigured dependency shows up as an error or crash while the application is running rather than at build time.
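To make the reflection idea concrete, here is a minimal sketch of a toy injector. The classes Engine and Car and the functions createAny and create are hypothetical names made up for this illustration, and the code is written as a standalone Kotlin file rather than a notebook cell. Real injectors add scopes, caching and annotations on top of this idea.

```kotlin
// Hypothetical classes, used only for illustration.
class Engine {
    fun start(): String = "engine started"
}

class Car(private val engine: Engine) {
    fun drive(): String = engine.start() + ", car moving"
}

// Toy injector: look up the first constructor at runtime and
// recursively build every parameter it asks for.
fun createAny(type: Class<*>): Any {
    val constructor = type.declaredConstructors.first()
    val arguments = constructor.parameterTypes.map { createAny(it) }
    return constructor.newInstance(*arguments.toTypedArray())
}

// Convenience wrapper so callers can write create<Car>().
inline fun <reified T : Any> create(): T = createAny(T::class.java) as T

fun main() {
    val car = create<Car>()   // the Engine is created and injected for us
    println(car.drive())      // prints "engine started, car moving"
}
```

Notice that nothing checks the object graph until create is called. If Car needed a type that cannot be constructed, we would only find out when this line runs.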
On the other hand, some injectors use a pre-compiler that generates all the classes they need (the object graph) with an annotation processor. As part of the build, an annotation processor reads the annotations in the code being compiled and generates source files that are then compiled into the project. Dependency resolution therefore happens before the application runs, which avoids those unexpected runtime errors. Yes, it is a lot to process; bear with us, it will get simpler.
Is it necessary?
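For comparison, here is a hand-written sketch of the kind of code a compile-time injector might generate. The names Database, ReportService and GeneratedGraph are hypothetical; an actual annotation processor would emit its own, more elaborate classes.

```kotlin
// Hypothetical classes, used only for illustration.
class Database {
    fun query(): String = "42 rows"
}

class ReportService(private val database: Database) {
    fun report(): String = "report: " + database.query()
}

// Stand-in for generated code: plain constructor calls, no reflection.
// If a dependency were missing, this file would simply not compile.
object GeneratedGraph {
    fun database(): Database = Database()
    fun reportService(): ReportService = ReportService(database())
}

fun main() {
    println(GeneratedGraph.reportService().report())   // prints "report: 42 rows"
}
```

Because the wiring is ordinary code, mistakes are caught by the compiler, and at runtime there is no reflective lookup to pay for.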
You are probably asking why we need to add dependencies at all, since the S2 platform already ships with pre-compiled libraries. You have a point, but only partly. The field of data science is vast, and new tools keep appearing as technology advances. We also create our own custom libraries with predefined functionality. No platform can bundle every tool, so our goal today is to learn how to inject dependencies on the S2 platform.
Libraries in Data science
This course is primarily about data science, so we will focus on libraries used for data science and demonstrate two of them: Weka and Klaxon.
WEKA
Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rule mining, and visualization, so it is an important tool for us.
Before moving forward, note that Weka is already pre-compiled on S2: you only need to run %use weka and that’s it. We add it by hand here purely for demonstration purposes.
import weka.classifiers.*

If Weka has not been loaded into the session, this import fails with an error. We will now look at how to inject the Weka library on the S2 platform. There is a website called mvnrepository that indexes the artifacts published to public repositories, where we can look up the dependencies we want: https://mvnrepository.com/

All you have to do is type in the name of the library that you want to inject (in our case weka), hit enter, and select the desired version.

As you can see, the page offers snippets for several build tools, such as Maven, Gradle and Grape. As S2 is Kotlin-friendly, choose Gradle (Kotlin). The snippet shown there contains the artifact's coordinates, which is what we need to inject this dependency with the two annotations below.
@file:Repository("https://repo1.maven.org/maven2/")
@file:DependsOn("nz.ac.waikato.cms.weka:weka-stable:3.8.0")
@file:Repository("..") tells the dependency resolver which repository URL to search, and @file:DependsOn("..") names the artifact to fetch in the shorthand notation group:artifact:version. The resolver downloads that artifact and puts it on the classpath before the rest of the code runs, so the import that follows can succeed.
This is how we successfully inject and import Weka.
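To see what the shorthand notation stands for, here is a small sketch that turns coordinates into the location of the jar file. It assumes the standard Maven repository layout and uses Maven Central's base URL, https://repo1.maven.org/maven2/. The names Coordinates, parseCoordinates and jarUrl are hypothetical helpers written for this illustration; they are not part of the S2 platform or of any resolver's API.

```kotlin
// group:artifact:version, as written inside @file:DependsOn("..")
data class Coordinates(val group: String, val artifact: String, val version: String)

fun parseCoordinates(text: String): Coordinates {
    val parts = text.split(":")
    require(parts.size == 3) { "expected group:artifact:version, got '$text'" }
    return Coordinates(parts[0], parts[1], parts[2])
}

// Standard Maven layout: dots in the group become directories,
// followed by artifact/version/artifact-version.jar
fun jarUrl(repository: String, c: Coordinates): String {
    val groupPath = c.group.replace('.', '/')
    return "${repository.trimEnd('/')}/$groupPath/${c.artifact}/${c.version}/${c.artifact}-${c.version}.jar"
}

fun main() {
    val weka = parseCoordinates("nz.ac.waikato.cms.weka:weka-stable:3.8.0")
    println(jarUrl("https://repo1.maven.org/maven2/", weka))
}
```

A real resolver does more than this (it also reads the artifact's metadata and caches downloads), but the mapping from coordinates to a path is the core idea.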
KLAXON
Klaxon is a JSON parser library. While learning data science we will not only encounter CSV data: much of the data we work with (especially for natural language processing) comes in JSON format. Parsing simply means analysing a text into its logical syntactic components. We follow exactly the same steps that we followed for Weka.

On the library's page you will stumble upon a choice between Central, Spring Lib Release, JCenter and so on. These are nothing more than the repositories that store the dependencies and their information. JCenter, for example, was a publicly accessible repository hosted on Bintray. Select the repository that suits you and the release version you wish to use.

Using the same kind of snippet for Klaxon as we did for Weka, let’s inject and import it.
@file:Repository("https://repo1.maven.org/maven2/")
@file:DependsOn("com.beust:klaxon:5.5")
import com.beust.klaxon.*
We have successfully injected and imported the Klaxon library. Now let’s parse some data to get a basic understanding.
import java.io.StringReader
We need to import Java's StringReader class because our data source is a string, and Klaxon reads its input as a character stream.
val data_simple = """
{
"name_data": [
{ "name": "Tan", "age": 23 },
{ "name": "May", "age": 22 },
{ "name": "Shi", "age": 19 },
{ "name": "Rose", "age": 12 },
{ "name": "Hak", "age": 52 },
{ "name": "Fay", "age": 83 }
]
}
"""
This is the simple string data that we want to parse using Klaxon.
data class Names(val name: String, val age: Int)
val klaxon = Klaxon()
val parsed = klaxon.parseJsonObject(StringReader(data_simple))
val dataArray = parsed.array<Any>("name_data")
val users = dataArray?.let { klaxon.parseFromJsonArray<Names>(it) }
We first define the data type that each data point belongs to, i.e. String for name and Int for age, and then use Klaxon to parse the data as shown in the snippet. Now let’s look at the output.
println(users)

We can access each user by index number. The parsed list may be null, so we use a safe call.
users?.get(1)

You may think that was pretty simple and that real-world data is rarely this tidy, and you would be right. What if we get a dataset from Slack and are asked to count the users and find the country or continent they belong to? That sounds tough. Alright, let’s do it.
First we need to acquire data from Slack.
(If you have your own workspace on Slack you can download its data; otherwise you can’t. For this tutorial I created a demo workspace so that we get a clear overview of a real-world problem.)
You can visit this link to check out how to download the data. https://slack.com/intl/en-in/help/articles/201658943-Export-your-workspace-data

This is what the users.json data looks like. It’s scary, isn’t it!
Let’s use the knowledge that we acquired in this article and try to count the users using Klaxon and the users.json file. The first step is to create a folder named data on the S2 platform and upload the users.json file into it.
@file:Repository("https://repo1.maven.org/maven2/")
@file:DependsOn("com.beust:klaxon:5.5")
import com.beust.klaxon.*
import java.io.*
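// Only the fields we need; tz falls back to "undefined" when it is missing.
data class User(val id: String, val tz: String = "undefined")
// parseArray returns null when nothing can be parsed, so fall back to an empty list.
val users = Klaxon().parseArray<User>(File("data/users.json").readText()) ?: emptyList()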
And that’s it! Can’t believe it? Hold on.
users.count()

This dataset is very small because it was made for demonstration purposes. Let’s check how many users are in Asia, going by the time zone stored in each user's tz field.
users.count { it.tz.contains("Asia") }
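We can take this one step further and count the users of every region at once. The sketch below is plain Kotlin and does not need Klaxon. It assumes that tz holds values of the form Region/City (for example Asia/Kolkata), and the sample users in main are made up for illustration. countByRegion is a hypothetical helper name.

```kotlin
data class User(val id: String, val tz: String = "undefined")

// Group users by the part of tz before the slash and count each group.
// A value without a slash (such as "undefined") is kept as its own group.
fun countByRegion(users: List<User>): Map<String, Int> =
    users.groupingBy { it.tz.substringBefore('/') }.eachCount()

fun main() {
    // Made-up sample data standing in for the parsed users.json
    val sample = listOf(
        User("U1", "Asia/Kolkata"),
        User("U2", "Asia/Tokyo"),
        User("U3", "Europe/Berlin"),
        User("U4")
    )
    println(countByRegion(sample))
}
```

On the S2 platform you could pass the users list parsed from data/users.json to countByRegion instead of the sample list.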

That wraps up library injection and a little fun with parsing using Klaxon. For more information, visit https://github.com/cbeust/klaxon