Python Basics, IPython, and Jupyter Notebook

haventmetyou 2023. 10. 31. 17:42

C:\Users\Desktop\project\basic>python hello.py
hello.py
__main__

C:\Users\Desktop\project\basic>ipython
In [1]: %run hello.py
hello.py
__main__

In [2]: an_apple = 27

In [3]: an_example = 42

Typing "an" and pressing the Tab key brings up tab completion.

Introspection

In [3]: b = [1, 2, 3]

In [4]: b?
Type:        list
String form: [1, 2, 3]
Length:      3
Docstring:
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.

In [5]: print?
Signature: print(*args, sep=' ', end='\n', file=None, flush=False)
Docstring:
Prints the values to a stream, or to sys.stdout by default.

sep
  string inserted between values, default a space.
end
  string appended after the last value, default a newline.
file
  a file-like object (stream); defaults to the current sys.stdout.
flush
  whether to forcibly flush the stream.
Type:      builtin_function_or_method

Putting a question mark before or after a variable name prints general information about that object.

In [7]: def add_numbers(a, b):
   ...:     """
   ...: Add two numbers together
   ...: Returns
   ...: ------
   ...: the_sum : type of arguments
   ...:     """
   ...:     return a + b
   ...:

In [8]: add_numbers?
Signature: add_numbers(a, b)
Docstring:
Add two numbers together
Returns
------
the_sum : type of arguments

File:      c:\users\desktop\project\basic\<ipython-input-7-004085b61198>
Type:      function
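
The ? syntax only works inside IPython and Jupyter. As a rough sketch, similar information can be pulled out in plain Python with the standard inspect module. The function below is a shortened copy of add_numbers, used only for illustration.

```python
import inspect

def add_numbers(a, b):
    """Add two numbers together."""
    return a + b

# Roughly the fields that add_numbers? shows in IPython
print(inspect.signature(add_numbers))   # (a, b)
print(inspect.getdoc(add_numbers))      # Add two numbers together.
print(type(add_numbers).__name__)       # function
```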

Wrapping part of a name in asterisks (*) and adding a question mark lists every name that contains it.

In [9]: import numpy as np

In [10]: np.*load*?
np.__loader__
np.load
np.loadtxt

In [11]: a = b

In [12]: a.append(4)

In [13]: a
Out[13]: [1, 2, 3, 4]

In [14]: b
Out[14]: [1, 2, 3, 4]

Assigning a value to a variable is called binding, because it links a name to an object.

Variable names that have been assigned a value are sometimes called bound variables.
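
A minimal sketch of what binding implies, using only built-ins: two names bound to the same list see each other's changes, while a copy made with list() is a separate object. The variable names are made up for this example.

```python
# Two names bound to the same list object
original = [1, 2, 3]
alias = original            # binds a second name; no data is copied
alias.append(4)

# list() builds a new list, so later changes do not propagate back
copied = list(original)
copied.append(5)

print(alias is original)    # True: same object
print(copied is original)   # False: separate object
print(original)             # [1, 2, 3, 4]
```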

In [15]: def append_element(some_list, element):
    ...:     some_list.append(element)
    ...:
In [16]: data = [1, 2, 3]

In [17]: append_element(data, 4)

In [18]: data
Out[18]: [1, 2, 3, 4]

In [23]: a = 4.5

In [24]: b = 2

In [25]: # String formatting is used here; it is covered in detail later

In [26]: print('a is {}, b is {}'.format(type(a), type(b)))
a is <class 'float'>, b is <class 'int'>

Importing modules

C:\Users\Desktop\project\basic>copy con some_module.py
# some_module.py
PI = 3.14159

def f(x):
    return x + 2

def g(a, b):
    return a + b
^Z    # press Ctrl+Z after typing the code
        1 file(s) copied.

In [2]: import some_module

In [3]: result = some_module.f(5)

In [4]:

In [4]: pi = some_module.PI

In [5]: pi
Out[5]: 3.14159

In [6]: # or

In [7]: from some_module import g, PI

In [8]: result = g(5, PI)

In [9]: result
Out[9]: 8.14159

Strings

In [33]: animals = '''
    ...: cat
    ...: dog
    ...: fish
    ...: '''

In [34]: animals
Out[34]: '\ncat\ndog\nfish\n'

In [35]: animals.count('\n')
Out[35]: 4

In [36]: a = 'this is a string'

In [37]: a[10] = 'f'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[37], line 1
----> 1 a[10] = 'f'

TypeError: 'str' object does not support item assignment

Python strings are immutable.

> To change one, use a method such as replace, which creates a new, modified string.

In [38]: b = a.replace('string', 'longer string')

In [39]: b
Out[39]: 'this is a longer string'

In [40]: a
Out[40]: 'this is a string'

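After this operation, the variable a is unchanged.

The backslash (\) is the escape character, used to write special characters such as the newline \n or Unicode characters.

To write a literal backslash, escape the backslash itself.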

In [54]: s = '12\34'

In [55]: s
Out[55]: '12\x1c'

In [56]: s = '12\\34'

In [57]: s
Out[57]: '12\\34'

Raw strings are useful for strings that contain many backslashes and no special characters, or when escape sequences should be kept exactly as typed.

In [52]: s = r'this\has\no\special\characters'

In [53]: s
Out[53]: 'this\\has\\no\\special\\characters'

Put r before the opening quote to make a raw string.
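
A small sketch comparing the two spellings. The Windows-style path is a made-up example.

```python
escaped = 'C:\\Users\\Desktop'   # each backslash escaped by hand
raw = r'C:\Users\Desktop'        # raw string: backslashes kept as typed

print(escaped == raw)   # True
print(len(r'\n'))       # 2: a backslash followed by the letter n
print(len('\n'))        # 1: a single newline character
```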

In [58]: template = '{0:.2f} {1:s} are worth US${2:d}'

{0:.2f} means format the first argument as a floating-point number with two decimal places.

{1:s} means format the second argument as a string.

{2:d} means format the third argument as an integer.

In [59]: template.format(88.46, 'Argentine Pesos', 1)
Out[59]: '88.46 Argentine Pesos are worth US$1'

Pass the arguments that should replace the format fields to the format method.

Python 3.6 added f-strings, a more convenient way to format strings.

To create an f-string, put f before the opening quote of the string.

In [60]: amount = 10

In [61]: rate = 88.46

In [62]: currency = 'Pesos'

In [63]: result = f'{amount} {currency} is worth US${amount / rate}'

In [64]: f'{amount} {currency} is worth US${amount / rate:.2f}'
Out[64]: '10 Pesos is worth US$0.11'

As with the string template above, a format specifier can follow each expression.
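
As a quick check that both styles accept the same specifiers, the sketch below formats the same values with str.format and with an f-string. The template text is adapted from the examples above.

```python
amount = 10
rate = 88.46
currency = 'Pesos'

# str.format with positional fields and format specifiers
template = '{0:d} {1:s} is worth US${2:.2f}'
via_format = template.format(amount, currency, amount / rate)

# The same specifiers written inside an f-string
via_fstring = f'{amount:d} {currency:s} is worth US${amount / rate:.2f}'

print(via_format)                  # 10 Pesos is worth US$0.11
print(via_format == via_fstring)   # True
```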

Sorting

In [34]: a = [7, 2, 5, 1, 3]

In [35]: a.sort()

In [36]: a
Out[36]: [1, 2, 3, 5, 7]

In [37]: b = ['saw', 'small', 'He', 'foxes', 'six']

In [38]: b.sort(key=len)

In [39]: b
Out[39]: ['He', 'saw', 'six', 'small', 'foxes']

sort has a few options. One of them is key, a function that returns the value to sort by.
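
A short sketch of the key option together with reverse, using the built-in sorted so the original list is left alone. The word list is the one from the session above.

```python
words = ['saw', 'small', 'He', 'foxes', 'six']

# key=len sorts by length; ties keep their original order (the sort is stable)
print(sorted(words, key=len))                 # ['He', 'saw', 'six', 'small', 'foxes']

# reverse=True flips the order, still keeping ties in original order
print(sorted(words, key=len, reverse=True))   # ['small', 'foxes', 'saw', 'six', 'He']

# key=str.lower gives a case-insensitive alphabetical sort
print(sorted(words, key=str.lower))           # ['foxes', 'He', 'saw', 'six', 'small']
```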

Slicing

# Select a section of the desired size
In [40]: seq = [7, 2, 3, 7, 5, 6, 0, 1]

In [41]: seq[1:5]
Out[41]: [2, 3, 7, 5]

# A slice can be assigned another sequence
In [42]: seq[3:5] = [6, 3]

In [43]: seq
Out[43]: [7, 2, 3, 6, 3, 6, 0, 1]

The element at the start index is included; the element at the stop index is not.

The number of elements in the result is stop - start.

In [44]: seq[-4:]
Out[44]: [3, 6, 0, 1]

In [45]: seq[-6:-2]
Out[45]: [3, 6, 3, 6]

Negative indices count positions from the end of the sequence.

In [46]: seq[::2]
Out[46]: [7, 3, 3, 0]

In [47]: seq[::-1]
Out[47]: [1, 0, 6, 3, 6, 3, 2, 7]

A step can be given after a second colon.

A step of -1 returns the list or tuple in reverse order.
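
The slicing rules above can be checked with a short sketch. The stop - start count holds when both indices fall inside the sequence; the list is the final state of seq from the session.

```python
seq = [7, 2, 3, 6, 3, 6, 0, 1]
start, stop = 1, 5

part = seq[start:stop]
print(part)                        # [2, 3, 6, 3]
print(len(part) == stop - start)   # True

# A step of -1 reverses tuples and strings as well
print((1, 2, 3)[::-1])             # (3, 2, 1)
print('abc'[::-1])                 # cba

# Negative indices count from the end: -1 is the last element
print(seq[-1], seq[-3:])           # 1 [6, 0, 1]
```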

Dictionaries

In [49]: empty_dict = {}

In [50]: d1 = {'a': 'some value', 'b': [1, 2, 3, 4]}

In [51]: d1
Out[51]: {'a': 'some value', 'b': [1, 2, 3, 4]}

In [52]: d1[7] = 'an integer'

In [53]: d1
Out[53]: {'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [54]: d1['b']
Out[54]: [1, 2, 3, 4]

In [55]: 'b' in d1
Out[55]: True

In [56]: d1[[10, 20]] = [100, 200]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[56], line 1
----> 1 d1[[10, 20]] = [100, 200]

TypeError: unhashable type: 'list'

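A dictionary stores key-value pairs. Other programming languages call it a hash map or an associative array.

A list cannot be used as a key.

Values can be deleted with the del keyword or with the pop method, which returns the value and deletes the key at the same time.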

In [57]: d1[5] = 'some value'

In [58]: d1
Out[58]: {'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 5: 'some value'}

In [59]: d1['dummy'] = 'another value'

In [60]: d1
Out[60]:
{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 5: 'some value',
 'dummy': 'another value'}

In [61]: del d1[5]

In [62]: d1
Out[62]:
{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 'dummy': 'another value'}

In [63]: ret = d1.pop('dummy')

In [64]: ret
Out[64]: 'another value'

In [65]: d1
Out[65]: {'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

The keys and values methods return iterable views of the keys and values, respectively. Keys appear in insertion order.

Both methods produce their keys and values in the same order.

In [66]: list(d1.keys())
Out[66]: ['a', 'b', 7]

In [67]: list(d1.values())
Out[67]: ['some value', [1, 2, 3, 4], 'an integer']
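
A small sketch of the "same order" property, plus one detail worth knowing: the objects returned by keys() and values() are live views, so they reflect later changes to the dictionary. The dictionary is a copy of d1 from the session.

```python
d1 = {'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

# keys() and values() line up position by position
paired = list(zip(d1.keys(), d1.values()))
print(paired == list(d1.items()))   # True

# The views are live: they reflect later changes to the dictionary
keys_view = d1.keys()
d1['c'] = 12
print(list(keys_view))              # ['a', 'b', 7, 'c']
```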

To iterate over keys and values together, use the items method, which yields each key-value pair as a tuple.

In [68]: list(d1.items())
Out[68]: [('a', 'some value'), ('b', [1, 2, 3, 4]), (7, 'an integer')]

The update method merges one dictionary into another.

In [69]: d1.update({'b': 'foo', 'c': 12})

In [70]: d1
Out[70]: {'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

If update is called with a key that already exists, the old value is discarded.
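
Because update changes the dictionary in place, here is a sketch of one way to get a merged copy instead, using dictionary unpacking. The name other is made up for this example.

```python
d1 = {'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}
other = {'b': 'foo', 'c': 12}

# Build a merged copy; on duplicate keys the right-hand value wins
merged = {**d1, **other}

print(merged)    # {'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}
print(d1['b'])   # [1, 2, 3, 4]  (d1 itself is untouched)
```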

Creating dictionaries from sequences

In [71]: tuples = zip(range(5), reversed(range(5)))

In [72]: tuples
Out[72]: <zip at 0x2287a0e2780>

In [73]: mapping = dict(tuples)

In [74]: mapping
Out[74]: {0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

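A dictionary is essentially a collection of 2-tuples.

The dict function accepts an iterable of 2-tuples (here, a zip object) and builds a dictionary from it.

(This is how the dictionary above is built.)

Default values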

In [75]: words = ['apple', 'bat', 'bar', 'atom', 'book']

In [76]: by_letter = {}

In [77]: for word in words:
    ...:     letter = word[0]
    ...:     if letter not in by_letter:
    ...:         by_letter[letter] = [word]
    ...:     else:
    ...:         by_letter[letter].append(word)
    ...:

In [78]: by_letter
Out[78]: {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

The setdefault method exists for this purpose; the loop above can be rewritten as:

In [79]: by_letter = {}

In [80]: for word in words:
    ...:     letter = word[0]
    ...:     by_letter.setdefault(letter, []).append(word)
    ...:

In [81]: by_letter
Out[81]: {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

The built-in collections module has a defaultdict class that makes this even easier.

To create one, pass a type or a function that generates the default value for each slot in the dictionary.

In [85]: from collections import defaultdict

In [86]: by_letter = defaultdict(list)

In [87]: for word in words:
    ...:     by_letter[word[0]].append(word)
    ...:

In [88]: by_letter
Out[88]: defaultdict(list, {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']})
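
For comparison, the three grouping approaches are sketched below in one self-contained script. The names plain, with_setdefault, and grouped are made up here. The last lines show that looking up a missing key on a defaultdict also inserts it.

```python
from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom', 'book']

# 1) Explicit membership test
plain = {}
for word in words:
    if word[0] not in plain:
        plain[word[0]] = []
    plain[word[0]].append(word)

# 2) setdefault: insert [] only when the key is missing
with_setdefault = {}
for word in words:
    with_setdefault.setdefault(word[0], []).append(word)

# 3) defaultdict: list() is called for every missing key
grouped = defaultdict(list)
for word in words:
    grouped[word[0]].append(word)

print(plain == with_setdefault == dict(grouped))   # True

# Looking up a missing key on a defaultdict also inserts it
print('z' in grouped)   # False
grouped['z']
print('z' in grouped)   # True
```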

In [91]: keys = ['a', 'b', 'c', 'd']

In [92]: x = dict.fromkeys(keys, 0)

In [93]: x
Out[93]: {'a': 0, 'b': 0, 'c': 0, 'd': 0}

In [94]: x['z']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[94], line 1
----> 1 x['z']

KeyError: 'z'

In [95]: from collections import defaultdict

In [96]: y = defaultdict(int)

In [97]: y['z']
Out[97]: 0

In [98]: int()
Out[98]: 0

In [99]: z = defaultdict(lambda: 'python')

In [100]: z['a']
Out[100]: 'python'

In [101]: z[0]
Out[101]: 'python'

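Valid dictionary keys

A dictionary's values can be any Python object, but its keys must be immutable objects such as scalar types (int, float, string) or tuples (whose elements must also be immutable).

Technically, the requirement is hashability. The hash function shows whether an object is hashable, and therefore usable as a dictionary key.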

In [103]: hash('string')
Out[103]: -3594434936365532107

In [104]: hash((1, 2, (2, 3)))
Out[104]: -9209053662355515447

In [105]: hash((1, 2, [2, 3]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[105], line 1
----> 1 hash((1, 2, [2, 3]))

TypeError: unhashable type: 'list'

Hashing is one-way: the original object cannot be recovered from its hash value. Every hash value is a fixed-size integer.
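
A minimal sketch of a hashability check built on the hash function. is_hashable is a hypothetical helper written for this note, not a built-in. The last lines show the usual workaround of converting a list to a tuple before using it as a key.

```python
def is_hashable(obj):
    """Return True if obj can be used as a dictionary key (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable('string'))          # True
print(is_hashable((1, 2, (2, 3))))    # True
print(is_hashable((1, 2, [2, 3])))    # False: the tuple contains a list

# Convert a list to a tuple to use it as a key
d = {}
d[tuple([10, 20])] = [100, 200]
print(d)                              # {(10, 20): [100, 200]}
```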

Sets

# Creating a set
In [106]: set([2, 2, 2, 1, 3, 3])
Out[106]: {1, 2, 3}

In [107]: a = {1, 2, 3, 4, 5}

In [108]: b = {3, 4, 5, 6, 7, 8}

# Union
In [109]: a.union(b)
Out[109]: {1, 2, 3, 4, 5, 6, 7, 8}

In [110]: a|b
Out[110]: {1, 2, 3, 4, 5, 6, 7, 8}

# Intersection
In [111]: a.intersection(b)
Out[111]: {3, 4, 5}

In [112]: a & b
Out[112]: {3, 4, 5}

# Subset and superset
In [113]: a_set = {1, 2, 3, 4, 5}

In [114]: {1, 2, 3}.issubset(a_set)
Out[114]: True

In [115]: a_set.issuperset({1, 2, 3})
Out[115]: True

# Two sets are equal if their contents are equal (sets are unordered)
# The same comparison on lists or tuples gives False
In [116]: {1, 2, 3} == {3, 2, 1}
Out[116]: True

Dictionary equality also ignores order: entering {1: 'cat', 2: 'dog'} == {2: 'dog', 1: 'cat'} in the shell prints True, even though a dictionary keeps its keys in insertion order.
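
A short sketch of the difference between equality and iteration order. The names d_a and d_b are made up for this example.

```python
d_a = {1: 'cat', 2: 'dog'}
d_b = {2: 'dog', 1: 'cat'}

print(d_a == d_b)                   # True: same key-value pairs
print(list(d_a), list(d_b))         # [1, 2] [2, 1]: insertion order is kept

# Lists compare element by element, so order matters there
print([1, 2] == [2, 1])             # False
print(set([1, 2]) == set([2, 1]))   # True
```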

Built-in sequence functions

1. enumerate

Use enumerate to keep track of the index of the current item while iterating over a sequence.

In [117]: a = [38, 21, 53, 62, 19]

In [118]: for i, v in enumerate(a):
     ...:     print(i, v)
     ...:
0 38
1 21
2 53
3 62
4 19

2. sorted

sorted returns a new sorted list built from the elements of any sequence; the original is left unchanged.

In [119]: sorted([7, 1, 2, 6, 0, 3, 2])
Out[119]: [0, 1, 2, 2, 3, 6, 7]

In [120]: sorted('horse race')
Out[120]: [' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

In [194]:  a = [7, 1, 2, 6, 0, 3, 2]

In [195]: sorted(a)
Out[195]: [0, 1, 2, 2, 3, 6, 7]

In [196]: a
Out[196]: [7, 1, 2, 6, 0, 3, 2]

In [197]: a.sort()

In [198]: a
Out[198]: [0, 1, 2, 2, 3, 6, 7]

3. zip

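zip pairs up the elements of several lists, tuples, or other sequences into tuples. It returns an iterator, so wrap it in list to get a list of tuples. It stops at the shortest input.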

In [126]: seq1 = ['foo', 'bar', 'baz']

In [127]: seq2 = ['one', 'two', 'three']

In [128]: zipped = zip(seq1, seq2)

In [129]: list(zipped)
Out[129]: [('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

In [130]: seq3 = [False, True]

In [131]: list(zip(seq1, seq2, seq3))
Out[131]: [('foo', 'one', False), ('bar', 'two', True)]

In [132]: for index, (a, b) in enumerate(zip(seq1, seq2)):
     ...:     print(f'{index}: {a}, {b}')
     ...:
0: foo, one
1: bar, two
2: baz, three

In [133]: a
Out[133]: 'baz'

In [134]: b
Out[134]: 'three'
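
Two related details, sketched with the same sample data: zip(*pairs) reverses the pairing ("unzipping"), and a zip object can only be consumed once.

```python
pairs = [('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

# Unzip: the * operator passes each pair as a separate argument to zip
first, second = zip(*pairs)
print(first)    # ('foo', 'bar', 'baz')
print(second)   # ('one', 'two', 'three')

# A zip object is a one-shot iterator
zipped = zip(['foo', 'bar'], ['one', 'two'])
print(list(zipped))   # [('foo', 'one'), ('bar', 'two')]
print(list(zipped))   # []
```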

List, set, and dictionary comprehensions

List comprehensions

In [135]: strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

In [136]: [x.upper() for x in strings if len(x) > 2]
Out[136]: ['BAT', 'CAR', 'DOVE', 'PYTHON']

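The filter condition is optional. Given a list of strings, the code above drops the strings of length 2 or less and converts the rest to uppercase.

Set and dictionary comprehensions work the same way.

To build a set holding the lengths of the strings in the list, use a set comprehension: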

In [141]: unique_lengths = {len(x) for x in strings}

In [142]: unique_lengths
Out[142]: {1, 2, 3, 4, 6}

The same result can be expressed functionally with the map function:

In [143]: set(map(len, strings))
Out[143]: {1, 2, 3, 4, 6}

Dictionary comprehension example: build a dictionary that maps each string to its position in the list.

In [144]: loc_mapping = {value: index for index, value in enumerate(strings)}

In [145]: loc_mapping
Out[145]: {'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

Nested list comprehensions

Suppose we want a list of the names that contain the letter a two or more times. A for loop can build it:

In [146]: all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'], ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

In [147]: names_of_interest = []

In [148]: for names in all_data:
     ...:     enough_as = [name for name in names if name.count('a') >= 2]
     ...:     names_of_interest.extend(enough_as)
     ...:

In [149]: names_of_interest
Out[149]: ['Maria', 'Natalia']

The whole loop above can be written as a single nested list comprehension:

In [151]: result = [name for names in all_data for name in names
     ...:             if name.count('a') >= 2]

In [152]:

In [152]: result
Out[152]: ['Maria', 'Natalia']

The for parts of a list comprehension are listed in nesting order, and the filter condition goes at the end.
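
To make the ordering rule concrete, the sketch below writes the same logic as fully nested for loops and as one comprehension. The for clauses appear in the same order in both. The names result_loops and result_comp are made up for this example.

```python
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

# Nested loops: outer loop first, inner loop second, filter last
result_loops = []
for names in all_data:
    for name in names:
        if name.count('a') >= 2:
            result_loops.append(name)

# Same order of clauses, read left to right
result_comp = [name for names in all_data for name in names
               if name.count('a') >= 2]

print(result_loops)                  # ['Maria', 'Natalia']
print(result_loops == result_comp)   # True
```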

An example that flattens a list of tuples of numbers into a simple list:

In [153]: some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [154]: flattened = [x for tup in some_tuples for x in tup]

In [155]: flattened
Out[155]: [1, 2, 3, 4, 5, 6, 7, 8, 9]

Functions

In [157]: def my_function(x, y):
     ...:     return x + y
     ...:

In [158]: my_function(1, 2)
Out[158]: 3

In [159]: result = my_function(1, 2)

In [160]: result
Out[160]: 3

If the end of the function body is reached without a return statement, None is returned automatically.

In [161]: def function_without_return(x):
     ...:     print(x)
     ...:

In [162]: result = function_without_return('hello!')
hello!

In [163]: print(result)
None

A function can take multiple positional arguments and keyword arguments.

Keyword arguments are most often used for default values or optional arguments.

In [164]: def my_function2(x, y, z=1.5):
     ...:     if z > 1:
     ...:         return z * (x + y)
     ...:     else:
     ...:         return z / (x + y)
     ...:

In [165]: my_function2(5, 6, z=0.7)
Out[165]: 0.06363636363636363

In [166]: my_function2(3.14, 7, z=3.5)
Out[166]: 35.49

In [167]: my_function2(10, 20)
Out[167]: 45.0

Keyword arguments are optional, but the positional arguments must be supplied when the function is called.

Keyword arguments must always come after the positional arguments.
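
A sketch of the calling rules, reusing my_function2 from above. Any argument can be passed by keyword, in any order, as long as positional arguments come first in the call.

```python
def my_function2(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

print(my_function2(10, 20))               # 45.0  (z falls back to 1.5)
print(my_function2(10, 20, 3.5))          # 105.0 (z passed by position)
print(my_function2(x=10, y=20, z=3.5))    # 105.0 (everything by keyword)
print(my_function2(z=3.5, y=20, x=10))    # 105.0 (keyword order is free)

# my_function2(z=3.5, 10, 20) would be a SyntaxError:
# a positional argument cannot follow a keyword argument
```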

Namespaces, scope, and local functions

A function can access variables in two scopes: global and local.

Another term for a variable's scope is namespace. Variables assigned inside a function belong to the local namespace by default.

In [168]: def func():
     ...:     a = []
     ...:     for i in range(5):
     ...:         a.append(i)
     ...:

In [169]: def func():
     ...:     a = []
     ...:     for i in range(5):
     ...:         a.append(i)
     ...:     print(a)
     ...:

In [170]: func()
[0, 1, 2, 3, 4]

In [171]: print(a)
baz

In [172]: a = []

In [173]: def func():
     ...:     for i in range(5):
     ...:         a.append(i)
     ...:

In [174]: func()

In [175]: a
Out[175]: [0, 1, 2, 3, 4]

In [176]: func()

In [177]: a
Out[177]: [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

In [178]: a = None

In [179]: def bind_a_variable():
     ...:     global a
     ...:     a = []
     ...: bind_a_variable()

In [180]:

In [180]: print(a)
[]

In [181]: print(a.append('apple'))
None

In [182]: a
Out[182]: ['apple']

Using the global keyword is generally discouraged.
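
One common alternative, sketched under the assumption that the goal is just to create and fill a list: pass the object in and return it, so no global name has to be rebound. make_list and append_range are hypothetical helpers written for this note.

```python
def make_list():
    """Return a new list instead of rebinding a global (hypothetical helper)."""
    return []

def append_range(target, n):
    """Append 0..n-1 to target and return it (hypothetical helper)."""
    for i in range(n):
        target.append(i)
    return target

a = make_list()        # the caller decides which name gets bound
append_range(a, 5)
print(a)               # [0, 1, 2, 3, 4]
```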

Returning multiple values

In [183]: def f():
     ...:     a = 5
     ...:     b = 6
     ...:     c = 7
     ...:     return a, b, c
     ...:

In [184]: a, b, c = f()

In [185]: f()
Out[185]: (5, 6, 7)

In [187]: x
Out[187]: {'a': 0, 'b': 0, 'c': 0, 'd': 0}

In [188]: x, y, z = (5, 6, 7)

In [189]: x
Out[189]: 5

In [190]: def f():
     ...:     a = 5
     ...:     b = 6
     ...:     c = 7
     ...:     return {'a': a, 'b': b, 'c': c}
     ...:

In [191]: result = f()

In [192]: result
Out[192]: {'a': 5, 'b': 6, 'c': 7}

Depending on the case, returning a dictionary can be more useful.
