Diary of an IT Guy #AnakBinus

Build Bridges, not Walls. Collaboration Forever!

Archive for the ‘Data Science’ Category

Data Engineering: It is about pipelines!


A data engineering process manages many pipelines, much like the ones that carry crude oil from oil fields to refineries. The end product is data that data scientists can use. The illustrations below compare the two:

Oil field processing:

Data Engineering process:
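As a rough sketch of the pipeline idea (not the process shown in the original illustration), raw input can flow through extract, transform, and load stages, the way crude oil flows toward a refinery. The function names, the sample rows, and the `name,height` format are all hypothetical.

```python
def extract(raw_lines):
    """Yield raw rows one at a time (the 'oil field' end of the pipeline)."""
    for line in raw_lines:
        yield line


def transform(rows):
    """Clean each row: trim whitespace, skip blanks, parse 'name,height'."""
    for row in rows:
        row = row.strip()
        if not row:
            continue  # drop empty rows
        name, height = row.split(",")
        yield {"name": name.strip(), "height_cm": int(height)}


def load(records):
    """Collect the cleaned records (a stand-in for writing to a warehouse)."""
    return list(records)


# Hypothetical raw input with messy spacing and blank rows
raw = ["  alice,170 ", "", "bob, 182", "   "]

# Chain the stages; each one consumes the output of the previous stage
clean = load(transform(extract(raw)))
print(clean)
```

Because each stage is a generator, rows move through the pipeline one at a time instead of being held in memory all at once.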

Written by isal

5 July 2021 at 06:45

What Is Machine Learning


What is Machine Learning?
Two definitions of Machine Learning are offered:
Arthur Samuel described it as: “the field of study that gives computers the ability to learn without being explicitly programmed.” This is an older, informal definition.
Tom Mitchell provides a more modern definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Example: playing chess.
E = the experience of playing many games of chess.
T = the task of playing chess.
P = the probability that the program will win the next game.
In general, any machine learning problem can be assigned to one of two broad classifications:

  • Supervised learning
  • Unsupervised learning
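To make Mitchell's E, T, and P concrete, here is a minimal supervised-learning sketch. It is a toy stand-in for the chess example, and every name and number in it is hypothetical: the task T is to label integers from 0 to 99 as 1 when they are at least 50, the experience E is a growing set of labeled examples, and the performance P is accuracy on the full range.

```python
def learn_threshold(examples):
    """Place the cut-off halfway between the largest x labeled 0
    and the smallest x labeled 1."""
    largest_zero = max(x for x, label in examples if label == 0)
    smallest_one = min(x for x, label in examples if label == 1)
    return (largest_zero + smallest_one) / 2


def accuracy(threshold, test_set):
    """P: fraction of test points whose predicted label matches the true one."""
    correct = sum((x >= threshold) == bool(label) for x, label in test_set)
    return correct / len(test_set)


# T: the toy rule the learner has to recover (1 when x >= 50, else 0)
test_set = [(x, int(x >= 50)) for x in range(100)]

# E: labeled examples arriving in three batches (hypothetical values)
batches = [[(0, 0), (60, 1)], [(40, 0), (55, 1)], [(49, 0), (50, 1)]]

seen = []
scores = []
for batch in batches:
    seen.extend(batch)  # experience grows
    scores.append(accuracy(learn_threshold(seen), test_set))

print(scores)  # accuracy after each batch
```

The examples carry labels, so this is supervised learning; an unsupervised method would have to find structure in the `x` values without them.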

Here are the important formulas for several Machine Learning methods:

Written by isal

23 April 2020 at 19:29

Python: Line Plot Basic (1)


With matplotlib, you can create a bunch of different plots in Python. The most basic plot is the line plot. A general recipe is given here.

import matplotlib.pyplot as plt

The World Bank has estimates of the world population for the years 1950 to 2100. The years are loaded in your workspace as a list called year, and the corresponding populations as a list called pop.

# Print the last item from year and pop
print(year[-1])
print(pop[-1])

# Import matplotlib.pyplot as plt
import matplotlib.pyplot as plt

# Make a line plot: year on the x-axis, pop on the y-axis
plt.plot(year, pop)

# Display the plot with plt.show()
plt.show()
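The exercise relies on year and pop being preloaded in the workspace. For a version that runs on its own, the sketch below substitutes a few rough placeholder values (in billions); they are illustrative only and are not the World Bank lists. The helper name `make_line_plot` is hypothetical.

```python
import matplotlib.pyplot as plt

# Placeholder data: rough, illustrative values in billions.
# These are NOT the World Bank lists that the exercise preloads.
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]


def make_line_plot(x, y):
    """Draw y against x as a line and return the figure and axes."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_xlabel("Year")
    ax.set_ylabel("Population (billions)")
    return fig, ax


fig, ax = make_line_plot(year, pop)

if __name__ == "__main__":
    plt.show()  # opens a window when an interactive backend is available
```

The first argument to `plot` goes on the x-axis and the second on the y-axis, which matches the `plt.plot(year, pop)` call in the exercise.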



Written by isal

20 January 2019 at 00:13

Posted in Data Science, Uncategorized


NumPy Array Exercise in Python (2019-01-19)


In the last few exercises you’ve learned everything there is to know about heights and weights of baseball players. Now it’s time to dive into another sport: soccer.

You’ve contacted FIFA for some data and they handed you two lists. The lists are the following:

positions = ['GK', 'M', 'A', 'D', ...]
heights = [191, 184, 185, 180, ...]

Each element in the lists corresponds to a player. The first list, positions, contains strings representing each player’s position. The possible positions are: 'GK' (goalkeeper), 'M' (midfield), 'A' (attack) and 'D' (defense). The second list, heights, contains integers representing the height of the player in cm. The first player in the lists is a goalkeeper and is pretty tall (191 cm).

You’re fairly confident that the median height of goalkeepers is higher than that of other players on the soccer field. Some of your friends don’t believe you, so you are determined to show them using the data you received from FIFA and your newly acquired Python skills.

# heights and positions are available as lists

# Import numpy
import numpy as np

# Convert positions and heights to numpy arrays: np_positions, np_heights
np_positions = np.array(positions)
np_heights = np.array(heights)

# Heights of the goalkeepers: gk_heights
# (the comparison builds a boolean mask that selects the 'GK' entries)
gk_heights = np_heights[np_positions == 'GK']

# Heights of the other players: other_heights
other_heights = np_heights[np_positions != 'GK']

# Print out the median height of goalkeepers
print("Median height of goalkeepers: " + str(np.median(gk_heights)))

# Print out the median height of other players
print("Median height of other players: " + str(np.median(other_heights)))


Median height of goalkeepers: 188.0
Median height of other players: 181.0
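The full FIFA lists are not reproduced in this post, so here is a small self-contained sketch of the same boolean-mask technique. The first four entries come from the snippet above; the last two are made up for illustration, so the medians differ from the output shown.

```python
import numpy as np

# First four entries are from the post; the last two are hypothetical
positions = ['GK', 'M', 'A', 'D', 'GK', 'M']
heights = [191, 184, 185, 180, 188, 178]

np_positions = np.array(positions)
np_heights = np.array(heights)

# Comparing an array with a string gives one True/False per player
is_gk = np_positions == 'GK'

# Indexing with the mask keeps only the entries where it is True;
# ~is_gk flips the mask to select everyone else
gk_median = np.median(np_heights[is_gk])
other_median = np.median(np_heights[~is_gk])

print("Median height of goalkeepers: " + str(gk_median))
print("Median height of other players: " + str(other_median))
```

Storing the mask in a variable means the comparison runs once and can be reused or inverted, instead of being written out for each group.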

Written by isal

19 January 2019 at 23:54

Posted in Data Science


Top Five Data Science Domain


Hello world Data Science!

Data Science Domains.pdf

Written by isal

18 January 2019 at 19:15
