Scaling and Benchmarking Self-Supervised Visual Representation Learning (2)

This post continues from 2019/12/01 - [AI/Paper Summaries] - Scaling and Benchmarking Self-Supervised Visual Representation Learning (1).

Scaling and Benchmarking Self-Supervised Visual Representation Learning

Priya Goyal, Dhruv Mahajan, Abhinav Gupta, Ishan Misra

(Submitted on 3 May 2019 (v1), last revised 6 Jun 2019 (this version, v2))

arXiv:1905.01235

For benchmarking, the authors use the nine datasets and tasks above.

The authors treat SSL as a way of learning feature representations rather than as an initialization method, so they restrict how much the features are fine-tuned.

Most evaluations proceed as follows.

1. Pre-train a self-supervised pretext method on YFCC or ImageNet.

2. Extract features from various layers.

3. Evaluate the extracted features by transfer learning on the nine datasets and tasks above.
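
The three steps above can be sketched in a few lines. This is only a toy illustration, not the authors' pipeline: the fixed ReLU projections in `FROZEN_LAYERS` are hypothetical stand-ins for features taken from two depths of a pre-trained network, the data is synthetic, and a small logistic-regression probe is assumed as the transfer classifier. The point is that only the probe is trained while the feature extractor stays untouched.

```python
import math
import random

# Hypothetical stand-ins for features taken at two depths of a frozen backbone.
FROZEN_LAYERS = {
    "layer_a": [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
    "layer_b": [[1.0, 1.0], [-1.0, -1.0]],
}

def extract(x, weights):
    """Fixed ReLU projection; these weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

def score(w, b, f):
    """Linear score of the probe for one feature vector."""
    return sum(wi * fi for wi, fi in zip(w, f)) + b

def train_linear_probe(feats, labels, epochs=20, lr=0.1):
    """Logistic regression trained with SGD on top of the fixed features."""
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = max(-30.0, min(30.0, score(w, b, f)))  # clamp for numerical safety
            g = 1.0 / (1.0 + math.exp(-z)) - y         # gradient of the log loss
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def make_toy_data(rng, n):
    """Synthetic two-class 2-D points, only so the sketch runs end to end."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        c = 3.0 if y == 1 else -3.0
        data.append(([rng.gauss(c, 1.0), rng.gauss(c, 1.0)], y))
    return data

def evaluate_layers(seed=0):
    """Train one probe per frozen layer and report held-out accuracy."""
    rng = random.Random(seed)
    train, test = make_toy_data(rng, 200), make_toy_data(rng, 100)
    results = {}
    for name, weights in FROZEN_LAYERS.items():
        feats = [extract(x, weights) for x, _ in train]
        w, b = train_linear_probe(feats, [y for _, y in train])
        hits = sum((score(w, b, extract(x, weights)) > 0) == (y == 1)
                   for x, y in test)
        results[name] = hits / len(test)
    return results

results = evaluate_layers()
print(results)
```

Comparing the per-layer accuracies in `results` mirrors how the benchmark reports which layer transfers best.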

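These are the image classification results (on Places205, VOC07, and COCO2014).

There is a substantial gap between the self-supervised and supervised approaches.

These are the results when only a small amount of image data is used.

Even with the res4 features, which gave the best results, the gap to supervised pre-training remains substantial.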
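
A minimal sketch of how such a small training set could be built, assuming the setting means keeping at most `k` labeled images per class. The function name `sample_low_shot` and the toy labels are made up for illustration and are not taken from the paper's code.

```python
import random
from collections import defaultdict

def sample_low_shot(labels, k, seed=0):
    """Return sorted indices of at most k randomly chosen examples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        idxs = by_class[label]
        # Classes with fewer than k examples contribute everything they have.
        chosen.extend(rng.sample(idxs, min(k, len(idxs))))
    return sorted(chosen)

# Toy label list: 10 cats, 10 dogs, 3 birds.
labels = ["cat"] * 10 + ["dog"] * 10 + ["bird"] * 3
subset = sample_low_shot(labels, k=4)
print(len(subset))
```

The classifier on top of the frozen features is then trained only on the indices in `subset`, and repeating this with several seeds gives a sense of the variance.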

To test visual navigation, an agent is given a stream of images and trained to navigate to a preset location.

With features extracted from res3, the agent obtained a much higher training reward and better sample efficiency than with supervised features.

Pre-training on ImageNet also worked better than pre-training on YFCC.

Transfer learning was also applied to object detection.

Detection uses Detectron, which does not support AlexNet, so only ResNet is evaluated.

The entire conv body of Fast R-CNN is frozen, and only the RoI head is trained.

Self-supervised initialization performed well enough to be comparable with the supervised version.

The final evaluation was surface normal estimation on the NYUv2 dataset.

For this 3D geometric task, the self-supervised methods provided slightly better features than supervised pre-training.

The legacy tasks are not covered here.

In conclusion, transfer performance increased log-linearly with data size.
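
"Log-linear" here means the score grows roughly as a + b * log(N) for N pre-training images, so every tenfold increase in data adds about the same number of points. The sketch below fits that form by ordinary least squares. The sizes and scores are made-up numbers chosen for easy arithmetic, not results from the paper, and `fit_log_linear` is a hypothetical helper.

```python
import math

def fit_log_linear(sizes, scores):
    """Least-squares fit of score = a + b * log10(size)."""
    xs = [math.log10(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Made-up data: each 10x increase in images adds 5 points.
sizes = [1_000_000, 10_000_000, 100_000_000]
scores = [40.0, 45.0, 50.0]

a, b = fit_log_linear(sizes, scores)
print(f"score = {a:.1f} + {b:.1f} * log10(N)")
```

With real measurements, a slope `b` that stays roughly constant across dataset sizes is what a log-linear trend looks like.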

The paper shows that scaling self-supervision matters, but that self-supervised pre-training is still a long way from surpassing supervised pre-training.

If you design an SSL pretext task, you can evaluate it with the benchmark suite provided by Facebook.

https://sites.google.com/view/fb-ssl-challenge-iccv19/home

License: Attribution, NonCommercial, NoDerivatives
